Preface


It is hard to overstate the importance of science and technology in modern 
society. Humans have been reshaping their lives and environment with tech- 
nology since the dawn of the species. The first stone tools and controlled use 
of fire even appear to predate Homo sapiens. There are countless technological 
touchstones along the path of human history that have fundamentally changed 
how we live: agriculture, metallurgy, alphabets, and so on. However, the rate 
of technological change has substantially increased in recent centuries. At the 
dawn of the Industrial Revolution, the steam engine, with its many impacts 
from railroads to factories, was at the forefront of social change. Subsequent 
technological advances continued to transform society. Imagine a world today 
without refrigeration, electric power, vaccinations, airplanes, plastics, or com- 
puters. In the early 21st century, the work of Silicon Valley was perhaps the 
most public image of technology. However, technology encompasses a much 
broader range of tools and techniques that humans employ to achieve goals. 

Given this fundamental role of science and technology in our lives, relatively 
little public discussion is focused on science and technology policy. Politicians 
frequently declare support for science research and technology development 
but have much less to say regarding exactly how or what innovation should be 
encouraged or avoided. 

It is equally surprising how little of science and engineering training is 
spent on the social impacts of research and development. Most engineering 
students in the US are required to take a token course in engineering ethics, 
often a business ethics course tailored to engineers, which is treated separately
from the technical coursework. Meanwhile, the physical sciences frequently 
require no training at all, while biological and social sciences usually require 
a short course in the appropriate treatment of animal or human test subjects. 
Some federal research grants require training in the responsible conduct of 
research—often weakly implemented (Phillips et al. 2018)—which focuses on 
the research and publication process. What happens after publication is given 
only cursory attention. 

Few people would argue that scientists and engineers bear absolutely no 
responsibility for how their work is used. Yet the potential use of research can 
be difficult to predict, so it is also hard to argue that they bear total responsibil- 
ity. So what is their level of culpability and what should they do? As with other 
tough questions without clear answers, the typical result is to politely ignore 
the issue or label it someone else’s jurisdiction. However, inaction comes with a 
price. Scientists and engineers, well-trained and comfortable in the lab or field, 
will occasionally find themselves under public scrutiny with inadequate train- 
ing in science policy and risk analysis. 

A particularly stressful scenario is when researchers announce or publish 
work only to receive a decidedly negative public reaction. No one wants their 
dedicated efforts, in which they take great pride, to be seen as dangerous 
science, but it happens. Research can be viewed as dangerous in either its 
practice or in its results. This danger can be broad. Some technologies may 
pose physical danger to humans or the environment. Other technologies are 
morally dangerous—they violate a common societal value, make crossing an 
ethical line easier, or simply cause more harm than benefits. 

I hesitated to call this book ‘Dangerous Science’ because I did not want to
alienate the intended audience with an alarmist and vaguely anti-science-sounding
title. However, terms such as ‘controversial research’ or ‘unpopular
technology’ do not quite capture the full impact of science and technology that 
meets public opposition. This book is intended for scientists and engineers, and 
it is important for this audience to understand that the science and technology 
that they work on could be dangerous on many levels. For example, real or 
perceived dangers to society can translate into real dangers to the careers of 
individual scientists and engineers. 

The book’s subtitle is just as important. This is an introductory, but not sim- 
plistic, guide to science policy and risk analysis for working scientists and engi- 
neers. This book is not for experts in science policy and risk analysis: it is for 
the biologist considering synthetic biology research; it is for the computer sci- 
entist considering autonomous military equipment development; it is for the 
engineers and atmospheric scientists considering geoengineering responses to 
climate change. The public has already found some of the research within these 
fields objectionable, and it is wise to enter the policy arena prepared. 

Scientists and engineers must be cognizant of cultural sensitivities, ethical 
dilemmas, and the natural potential for unwarranted overconfidence in our 
ability to anticipate unintended harm. This task can be simple yet difficult.
On one hand, the writings and conversations of many scientists are often
deeply self-reflective and nuanced. Yet, the fundamental core of science and 
engineering is empirical, analytical, and often reductionist—traits that can 
work against making connections between technology and society. 

Science is often used to make public policy, and there is a perennial effort 
to increase ‘evidence-based’ public policy (Cairney 2016). Although they are 
related tasks, this book does not focus on how to use science to make good 
policy but rather how to use policy to make good science. Specifically, it 
explores the idea of dangerous science—research that faces public opposition 
because of real or perceived harm to society—and why debates over controver- 
sial research and technology are not easily resolved. More importantly, it also 
suggests techniques for avoiding a political impasse in science and technology 
policymaking. The target audience of this book is future or working scientists 
and engineers—people who care deeply about the impact of their work but 
without the time to fully explore the fields of science policy or risk analysis (it 
is hard enough being an expert in one field). 

The French polymath Blaise Pascal was one of many authors to have noted 
that they would have preferred to write more briefly if only they had the time 
to shorten their work. Given that the seeds of this book were formed about a 
decade ago, I’ve made considerable effort to distill this work down to some- 
thing not overly imposing to the reader while avoiding the oversimplification 
of complex issues. The intent here is not to drown the reader in minutiae, but 
rather to lead the reader through an overview of the difficulties of assessing and 
managing science and technology with ample references for further reading 
as desired. The hope of such brevity is that it will actually be read by the many 
busy professionals who need to consider the societal impact of potentially dan- 
gerous science and do not want to find themselves unprepared in the middle of 
a political maelstrom. 

While a primary audience of the book is the graduate student looking for 
a supplement to standard ‘responsible conduct of research’ required reading 
(e.g., Institute of Medicine 2009), the book is also written to be approachable 
by anyone interested in science policy. If there is one overarching lesson to be 
taken from this book, it is that science in the public interest demands public 
involvement. Science and technology have become too powerful to engage in 
simple trial and error experimentation. Before action, thoughtful consideration 
is required, and this benefits from as many ideas and perspectives as possible. 
Oversight must now be a communal activity if it is to succeed. 

The general form of the book is laid out in the following sequence. 

In the first chapter, a case study is presented that walks through the events 
and fundamental issues in one dangerous science example. In this case, it was 
the debate, starting in 2011, over gain-of-function research involving the H5N1 
avian influenza virus that sparked public fears of a potential accidental pan- 
demic. The description of the multi-year debate demonstrates the practical 
difficulties of assessing and managing dangerous science. It ends with the question
of why a formal risk-benefit analysis commissioned for the debate failed to
resolve the controversy. 

In the second chapter, we tackle one part of that question and review the 
many ways in which the benefits of research can be defined. In addition, com- 
paring the various methods of estimating the benefits of research can provide 
insight into how science policy is formulated. We review data-driven meth- 
ods of assessing research benefits, including estimating the effects of research 
on job production, economic growth, scientific publications, or patents. More 
subjective methods, such as value-of-information analysis and expert opinion, 
have also been recommended to account for less quantifiable benefits and pub- 
lic values. A comparison of the various legitimate, but essentially incomparable, 
ways that research benefits are assessed suggests that no form of assessment can 
be both quantitative and comprehensive. Discussing the strengths and weak- 
nesses of each approach, I argue there is currently no reliable or universally 
acceptable way of valuing research. The result is that formal assessments of 
research benefits can be useful for informing public science policy debates but 
should not be used as science policy decision criteria. 

In the third chapter, we tackle the other half of the risk-benefit debate by 
reviewing the many factors that can compromise the perceived legitimacy of a 
risk assessment. Formal risk assessment is often idealized as objective despite 
many warnings that subjective value judgments pervade the risk assessment 
process. However, prior warnings have tended to focus on specific value 
assumptions or risk assessment topics. This chapter provides a broad review 
of important value judgments that must be made (often unknowingly) by an 
analyst during a risk assessment. The review is organized by where the value 
judgments occur within the assessment process, creating a values road map in 
risk assessment. This overview can help risk analysts identify potentially con- 
troversial assumptions. It can also help risk assessment users clarify arguments 
and provide insight into the underlying fundamental debates. I argue that open 
acknowledgment of the value judgments made in any assessment increases its 
usefulness as a risk communication tool. 

In the fourth chapter, we acknowledge that policy formulation for controversial 
science and technology must often occur in the absence of convincing evidence. 
As a result, technology policy debates frequently rely on existing technological 
risk attitudes. I roughly categorize these attitudes as either technological 
optimism or skepticism and review multiple theories that have been proposed 
to explain the origins of these attitudes. Although no individual theory seems 
to provide a complete explanation so far, we do know that technological risk 
attitudes are flexible and influenced by a complex range of factors that include 
culture and personal circumstances. An important result of these opposing 
attitudes is that moral arguments against dangerous science are often downplayed
and policymakers tend to be cautiously permissive. Several emerging
technologies, such as human genome editing, synthetic biology, and autonomous
weapons, are discussed in the context of technological risk attitudes.

In the fifth and last chapter, we turn to potential solutions for managing 
science controversies. After briefly reviewing traditional risk management 
techniques, I argue dangerous science debates should place less emphasis on 
attempting to quantify risks and benefits for use as a decision tool. Rather,
risk-benefit assessments are better used as risk exploration tools to guide better
research design. This is accomplished by engaging multiple perspectives and 
shifting away from traditional safety and security measures toward more inherently
safe research techniques that accomplish the same goals. The application
of these principles is discussed in the example of gene drive technology.

One final remark on the book’s contents: There is plenty of ammunition in 
this book for science ‘denialists’ if they engage in cherry-picking. Despite the 
critiques of particular lines of research or methods presented here, this book 
is not an attack on the enterprise of science, which has incalculable practical 
and intellectual value to society. Generally, more science is better. However, 
it is antithetical to the progress of science to take the authoritarian approach 
of ‘You're either with us or against us’ and to avoid all valid criticisms of how 
science is conducted. The purpose here is to improve the process for assessing 
and managing the broader impacts of science. Open discussion is the only 
way forward. 


Case Study: H5N1 Influenza Research Debate


Media coverage of the latest scientific discoveries and technological innovations 
is usually enthusiastic. Long-standing questions are answered, productivity is 
increased, health is improved, and our standard of living is raised—all due to 
modern science and human ingenuity. However, sometimes the results of research 
can also inspire public fear and outrage. Let us consider a recent example. 

In September 2011, Dutch virologist Ron Fouchier announced at a confer- 
ence in Malta that his research team had recently engineered a version of the 
H5N1¹ avian influenza virus that was highly transmissible between mammals.
Shortly thereafter, virologist Yoshihiro Kawaoka of the University of Wisconsin 
presented results from a similar study. The subsequent media coverage scared 
the general public and set off demands for a review of how science research is 
assessed, funded, and managed. 

Why would these studies be so scary? The reason centers on the lethal poten- 
tial of influenza and the H5N1 virus in particular. An influenza pandemic typi- 
cally occurs when a flu virus substantially different from circulating viruses 
mutates to become easily transmissible between humans. The combination of 


¹ The formal naming convention for influenza viruses includes the antigenic type (A,
B, or C); the originating host (e.g., swine); the geographical origin; year of identification;
strain number; and, for type A viruses, the specific subtypes of two surface proteins,
hemagglutinin (H) and neuraminidase (N) (e.g., H5N1) (Assaad et al. 1980).
Given the complexity of formal names, popular shorthand names, such as Spanish
flu, Swine flu, and H1N1, can potentially refer to the same influenza virus.
The World Health Organization has been working to improve shorthand names to 
make them less stigmatizing and more informative. 
limited natural immune response and quick transmission is dangerous. The
1957 and 1968 influenza pandemics killed about 0.1 percent of the individu- 
als infected, which resulted in over one million fatalities worldwide in each 
case. The relatively mild 2009 influenza pandemic’s mortality rate was half that. 
The infamous 1918 influenza pandemic, which killed tens of millions world- 
wide, had an estimated lethality of 2 to 5 percent. Although highly uncertain, 
estimates of H5N1 lethality range from 1 to 60 percent (Li et al. 2008; Wang, 
Parides & Palese 2012). The few cases of H5N1 influenza in humans have been 
primarily attributed to direct contact with birds that harbored the virus. If the 
virus gained the ability to easily transmit between humans while retaining its 
lethality, the results would be catastrophic. This is exactly what the H5N1 stud- 
ies appeared to create and why the public reaction was so negative. Although 
there are many pathogens capable of creating pandemics, the influenza virus 
is exceptional in that it tends to be easily transmissible and cause high rates 
of infection with a virulence that ranges from mild to deadly. For these rea- 
sons, a major influenza pandemic has been placed on the short list of truly 
catastrophic global events that includes nuclear war, climate change, and large 
meteor strikes (Osterholm & Olshaker 2017; Schoch-Spana et al. 2017). 
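The lethality figures quoted above are easier to compare when they are applied to the same number of infections. The sketch below does only that arithmetic. The one billion infections figure is an assumption made for illustration (it is roughly what 0.1 percent lethality and one million fatalities would imply), and the names `ASSUMED_INFECTIONS`, `LETHALITY_PERCENT`, and `deaths` are hypothetical; it is not an epidemiological model.

```python
# Illustrative arithmetic only. ASSUMED_INFECTIONS is an assumption,
# not a figure taken from the text.
ASSUMED_INFECTIONS = 1_000_000_000

# (low, high) lethality in percent of those infected, as quoted in the text.
LETHALITY_PERCENT = {
    "1957 and 1968 pandemics": (0.1, 0.1),
    "2009 pandemic": (0.05, 0.05),  # "half that"
    "1918 pandemic": (2.0, 5.0),
    "H5N1 (highly uncertain)": (1.0, 60.0),
}


def deaths(infections: int, lethality_percent: float) -> float:
    """Fatalities for a given number of infections and a lethality in percent."""
    return infections * lethality_percent / 100.0


if __name__ == "__main__":
    for name, (low, high) in LETHALITY_PERCENT.items():
        low_deaths = deaths(ASSUMED_INFECTIONS, low)
        high_deaths = deaths(ASSUMED_INFECTIONS, high)
        print(f"{name}: {low_deaths:,.0f} to {high_deaths:,.0f} deaths")
```

Holding the number of infections fixed is the simplifying choice here; in practice the number infected would differ from one pandemic to the next.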

The response to the H5N1 research announcements shows the difficulty of 
assessing and managing potentially dangerous research. The purpose of the two 
bird flu studies (Herfst et al. 2012; Imai et al. 2012), both funded by the US 
National Institutes of Health (NIH), was to investigate how easily the H5N1 
virus could naturally become a more serious public health threat. However, 
many scientists and security experts became concerned that the papers detail- 
ing the experiments could be a blueprint for skilled bioterrorists. In November 
2011, the US National Science Advisory Board for Biosecurity (NSABB) 
recommended redacting the methodology for each paper. This was the first 
recommendation of publication restriction since the board’s formation in 2005. 
The following month, the Dutch government, based on a law—Council Regu- 
lation (EC) No 428/2009—aimed at weapons nonproliferation, requested that 
Dr. Fouchier apply for an export license before publishing his research. Although 
the license was granted within days, the unprecedented application of the law to 
virology research was shocking to many in the science community. As a result, 
a voluntary H5N1 research moratorium was agreed upon by prominent influ- 
enza research laboratories in January 2012 until new guidelines could be put in 
place. The following month, a review by a panel of experts at the World Health 
Organization (WHO) contradicted the NSABB by recommending full publica- 
tion of the research. The NSABB performed a second review in March 2012 and 
reversed its position by also recommending full publication. Critics viewed this 
sudden change as acquiescence to pressure from the scientific community.²


² An insider’s perspective of the NSABB decision process and the pressures it faced
can be found in Deadliest Enemy (Osterholm & Olshaker 2017). 
Along with the NSABB reversal, the NIH simultaneously released a new
policy, US Government Policy for Oversight of Life Sciences Dual Use Research 
of Concern, as guidance for institutional biosafety committees. The four-page 
document clarified what research counted as ‘dual-use research of concern’,
that is, research that has both potential societal benefits and obvious malicious
uses. Almost a year later, new guidelines were released by the US Department 
of Health and Human Services (HHS) in February 2013 for funding H5N1 
gain-of-function research. In this case, gain-of-function means the purposeful 
mutation of a disease agent to add new functions or to amplify existing unde- 
sirable functions, such as transmissibility or virulence. Among the various new 
requirements, the HHS policy mandated the following: the research address 
an important public health concern; no safer alternative could accomplish the 
same goal; biosafety and biosecurity issues were addressed; and risk reduction 
oversight mechanisms be put in place (Patterson et al. 2013). The review pro- 
cess was extended to H7N9 bird flu research in June 2013. 

While these clarifications would appear to have settled the matter, they did 
not. One reason is that, like the various policies before it, the HHS policy did not
address exactly how to assess the risks and benefits of a research proposal. Back 
in 2006, the NSABB made a similar move when it asked authors, institutional 
biosafety committees, and journal editors to perform a risk-benefit analysis 
before publishing dual-use research, without providing detailed guidance on 
how to perform such an analysis. Five years later, an NIH random survey of 
155 life science journals found that less than 10 percent had a written dual-use 
policy or reported reviewing dual-use manuscripts in the previous 5 years 
(Resnik, Barner & Dinse 2011). Likewise, a 3-year sample of 74,000 biological 
research manuscripts submitted to Nature Publishing Group resulted in only 28 
flagged and no rejected manuscripts for biosecurity concerns (Boulton 2012). 

While it is possible that dual-use research is quite rare, it is more likely the 
research community is just unable to recognize research as dangerous due to 
lack of training and proper guidance (Casadevall et al. 2015). In light of publi- 
cations, such as the papers detailing the synthesis of poliovirus (Cello, Paul & 
Wimmer 2002) and the reconstruction of the 1918 flu virus (Tumpey et al. 
2005), perhaps the H5N1 influenza papers are most notable in that they actu- 
ally started a significant public debate over biosecurity policy (Rappert 2014). 

Thus, it is not surprising that the new HHS policy did not resolve the con- 
troversy, and three papers published in 2014 renewed the debate surround- 
ing gain-of-function flu research. One study made the H7N1 flu strain, which 
was not covered by the new HHS rules, transmissible in mammals (Sutton 
et al. 2014). A second study by Dr. Fouchier’s lab expanded on earlier H5N1 
work (Linster et al. 2014). The third paper, published by Dr. Kawaoka’s lab, 
detailed the engineering of a virus similar to the strain responsible for the 
1918 flu pandemic to argue that another major pandemic could arise from 
the existing reservoir of wild avian flu viruses (Watanabe et al. 2014). Many
critics were particularly disturbed by this last paper because the University of 
Wisconsin biosafety review of the proposal failed to classify the work as ‘dual-use
research of concern’ despite a consensus among biosecurity experts that it
clearly was. Collectively, these results again raised concerns that an intentional 
or accidental release of an engineered virus from gain-of-function influenza 
research could be the source of a future pandemic—an ironic and deadly
self-fulfilling prophecy.


The Emphasis Shifts from Terrorists to Accidents 


Events in 2014 brought the H5N1 debate back to the popular media but shifted 
the primary concern from biosecurity to biosafety. First, an accidental expo- 
sure of multiple researchers to anthrax bacteria was discovered at the US Cent- 
ers for Disease Control and Prevention (CDC) in Atlanta. Shortly thereafter, it 
was announced there had been an accidental contamination of a weak flu sam- 
ple with a dangerous flu strain at another CDC lab that further jeopardized lab 
workers. This led to the temporary closure of the CDC anthrax and influenza 
research labs in July 2014 and the resignation of the head of a bioterrorism lab. 

The final straw may have been the discovery of six vials of live smallpox 
virus in storage at a US Food and Drug Administration (FDA) lab in Bethesda, 
Maryland, in June 2014. After smallpox was globally eradicated in 1980, only 
two secure labs (the CDC in Atlanta, Georgia and the Vector Institute in Novo- 
sibirsk, Russia) were supposed to have smallpox virus samples. The planned 
destruction of these official samples had been regularly debated by the World 
Health Assembly for decades (Henderson & Arita 2014). Finding unsecured 
samples elsewhere was rather disturbing. 

After these events, it became clear that human error was much more preva- 
lent at the world’s top research facilities than previously believed. This prompted 
CDC director Thomas Frieden to suggest closing many biosafety level 3 and 4 
labs.³ Unfortunately, there was no definitive list of these labs for public review
or government oversight (Young & Penzenstadler 2015). However, it was esti- 
mated that the number of labs working with potentially pandemic pathogens 
had tripled in this century. As a result, the number of reported lab accidents
and the number of workers with access or exposure risk increased by
at least an order of magnitude. According to a 2013 US Government Accountability
Office assessment, the increased number of labs and lab workers had
unintentionally increased rather than decreased national risk. 

These mishaps led the US Office of Science and Technology Policy and 
the HHS to impose yet another temporary moratorium on NIH-funded 
gain-of-function research for influenza in October 2014. The moratorium was 


³ Biosafety levels (BSL) range from BSL-1 to BSL-4, with the latter having the strictest
protocols and equipment for handling potentially fatal infectious agents for which 
there are no vaccines or treatment. 
intended to last until the NSABB and National Research Council (NRC) could
assess the risks and benefits of these lines of research. 


The Risk-Benefit Assessment 


The moratorium called for ‘a robust and broad deliberative process’ to ‘evaluate 
the risks and potential benefits of gain-of-function research with potential pan- 
demic pathogens’ (OSTP 2014). The proposed multi-step process consisted of a 
series of meetings by the NSABB and NRC. The meetings would first draft rec- 
ommendations on how to conduct a risk-benefit analysis and then evaluate an 
independently performed risk-benefit assessment and ethical analysis (Selgelid 
2016). The primary role of the NRC was to provide additional feedback from 
the scientific community while the NSABB provided the final recommendation 
to the Secretary of HHS and director of the NIH.⁴ A $1.1 million contract was
awarded to an independent⁵ Maryland biodefense consulting firm, Gryphon
Scientific, in March 2015 to conduct the risk-benefit analysis. The assessment 
was to be ‘comprehensive, sound, and credible’ and to use ‘established, accepted 
methods in the field’ (NIH 2014). The draft assessment (Gryphon Scientific 
2015), completed in eight months, was over 1,000 pages long. It included a sep- 
arate quantitative biosafety risk analysis, a semi-quantitative biosecurity risk 
analysis, and a qualitative benefits assessment. The main finding of the report 
was that the majority of gain-of-function research posed no more risk than existing
wild-type influenza strains. However, for a strain as pathogenic as the 1918
influenza, the biosafety risk analysis estimated that an accidental laboratory
release would cause a global pandemic every 560 to 13,000 years, resulting in
up to 80 million deaths. The biosecurity assessment was harder to quantify but 
estimated a similar risk if the theft of infectious material by malevolent actors 
occurred at least every 50 to 200 years. 
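The same estimate can read very differently depending on the units chosen to report it. As a rough sketch, the return periods quoted above can be restated as an approximate yearly probability and as an average number of deaths per year. This assumes the 80 million figure (an upper bound in the text) applies to every event and that one event per return period is a fair approximation; the function names are hypothetical, and this is not the method used in the Gryphon assessment.

```python
# Illustrative unit conversion only. Treating the upper-bound death toll
# as the loss for every event is a simplifying assumption.
DEATHS_PER_EVENT = 80_000_000
RETURN_PERIODS_YEARS = (560, 13_000)


def annual_probability(return_period_years: float) -> float:
    """Approximate yearly chance of an event that recurs once per return period."""
    return 1.0 / return_period_years


def expected_annual_deaths(return_period_years: float, deaths_per_event: float) -> float:
    """Long-run average deaths per year implied by the return period."""
    return deaths_per_event / return_period_years


if __name__ == "__main__":
    for period in RETURN_PERIODS_YEARS:
        chance = annual_probability(period)
        average = expected_annual_deaths(period, DEATHS_PER_EVENT)
        print(
            f"once every {period:,} years: about {chance:.4%} per year, "
            f"averaging {average:,.0f} deaths per year"
        )
```

A long return period can sound reassuring while the annualized average does not, which is one reason the choice of reporting units matters in a risk assessment.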

An NRC symposium (NRC 2016) was convened in March 2016 to further 
discuss the risk-benefit assessment and the draft recommendations proposed 
by the NSABB. Comments made during the symposium, as well as public com- 
ments received directly by the NSABB, were generally critical of the Gryphon 
report. Criticisms included claims that both the completion of the report and 
the review process were rushed, the report ignored existing data, the report 
missed alternative methods of infection, and it communicated results in units 
that obscured risk. Some members of the scientific community believed the 


⁴ Yes, that is a lot of acronyms.

⁵ An independent third-party analysis should not be confused with an impartial anal-
ysis. Analysts can be influenced by the hiring organization, social pressures, and 
cultural norms. Furthermore, analysts trained and embedded in the field of interest 
are more likely to have personal positions at the outset—it is not easy to be a truly 
impartial expert. 
report was comprehensive and balanced because parties on both sides of the
debate were not fully satisfied with the report (Imperiale & Casadevall 2016). 
However, unlike policy debates, the final goal of science is truth-seeking, not 
compromise. Thus, there seemed to be some confusion regarding whether the 
risk-benefit assessment was a technical or policy tool. 

The final NSABB report (NSABB 2016) contained a series of findings and 
recommendations. The report found that most types of gain-of-function 
research were not controversial and the US government already had many 
overlapping policies in place for managing most life science research risks. 
However, not all research of concern was covered by existing policies. More 
importantly, there was some research that should not be conducted because the 
ethical or public safety risk outweighed the benefits. Unfortunately, no specific 
rule of what constitutes unacceptable research could be formulated. Rather, the 
report stated the need for ‘an assessment of the potential risks and anticipated 
benefits associated with the individual experiment in question’. This essentially
meant that any study in question would require its own ad hoc risk-benefit 
assessment process. The NSABB report provided a short list of general types 
of experiments that would be ‘gain-of-function research of concern’ requir- 
ing additional review, as well as types of experiments that would not, such as 
gain-of-function research intended to improve vaccine production. While this 
provided more clarification than what was available at the start of the process, 
public comments from the NRC and NSABB meetings tended to show a general
dissatisfaction with the lack of specificity and the vagueness of terms, such as
‘highly transmissible’. In particular, the critics were hoping for a detailed list
of types of dangerous experiments that should not be conducted. There was 
also concern that the definition of ‘gain-of-function research of concern’ was 
too narrow. Based on the new criteria, it appeared that the original research 
that started the debate—creating a mammalian-transmissible avian influenza 
virus—would not be considered research of concern. Finally, there was also 
fear that assessments would not be conducted by disinterested third parties 
with appropriate expertise. 

In the end, the competing stakeholders had not reached consensus: the 
proponents saw the additional regulatory processes as unnecessary, while the 
critics saw the new recommendations as grossly insufficient. The fundamental 
problem was that the competing experts still did not agree on even the basic 
utility or level of danger associated with the research. Consequently, it was not 
possible to create a defensible quantitative risk-benefit assessment or subse- 
quent science research policy that was endorsed by most of the stakeholders. 

In January 2017, the Office of Science and Technology Policy issued a pol- 
icy guidance document (OSTP 2017) that lifted the gain-of-function research 
moratorium for any agency that updated review practices to follow the NSABB 
recommendations. The NIH officially announced it would once again con- 
sider funding gain-of-function research in December 2017. At the same time, 
HHS issued a six-page guidance document, Framework for Guiding Funding 
Decisions about Proposed Research Involving Enhanced Potential Pandemic
Pathogens (HHS P3CO Framework), which outlined the extra layer of consid- 
eration required for future funding decisions for applicable research. The US 
government approved the continuation of gain-of-function research at the labs 
of both Dr. Kawaoka and Dr. Fouchier in 2018 (Kaiser 2019). 

So, over the course of six years, six public NSABB meetings, two public NRC 
meetings, a year-long independent formal risk-benefit analysis, and a three- 
year pause on NIH funding for specific gain-of-function viral research, the 
research community found itself essentially back where it started in terms of 
how to actually assess and manage dangerous research. Proponents considered 
the matter settled, while critics saw the regulatory exercise as a fig leaf for con-
tinuing business as usual. The only significant improvement was that the influenza
research community was now a bit wiser about the limitations of the science
policy-making process. 

One could argue that this is a rather harsh assessment of the substantial efforts 
of many individuals, but even if we take the optimistic view that the gain-of- 
function research debate refocused the attention of the scientific community 
on the importance of assessing and managing dangerous science research, then 
how do we explain continuing research controversies? For example, in early 
2018, virologists from the University of Alberta published a paper describ-
ing the de novo synthesis of an extinct horsepox virus (Noyce, Lederman & Evans
2018). Claims that the work could lead to a safer smallpox vaccine were met 
with skepticism considering a safe vaccine already exists and there is no mar- 
ket for a new one (Koblentz 2018). Despite the paper passing a journal's dual- 
use research committee, critics argued that the paper lays the groundwork for 
recreating the smallpox virus from scratch (Koblentz 2017). If reactive ad hoc 
narrow regulations worked, then the scientific research community would not 
continue to be surprised by such seemingly reckless studies on a regular basis. 


A Larger Question 


This case study contains some valuable lessons regarding science policy, the 
responsible conduct of research, and the need to consider the implications and 
public perception of dangerous research. One obvious lesson is that science 
policy-making is a political decision-making process and not simply a matter of 
sufficient data and analysis. This was best summarized by risk expert Baruch Fis- 
chhoff, an NRC symposium planning committee member, when he said, ‘Any- 
body who thinks that putting out a contract for a risk-benefit analysis will tell the 
country what to do on this topic is just deluding themselves’ (NRC 2015). How- 
ever, it is less obvious why this is so. After all, insurance companies have been 
conducting quantitative risk assessments for decades—why is this any different? 

The US government response to the gain-of-function influenza research con-
troversy was a typical response to dangerous science: an ad hoc directive to
assess and monitor research in its early stages without details on how to pro-
ceed. Strangely, the lack of clear procedures for conducting a risk-benefit assess-
ment extends to even relatively narrow and uncontroversial research questions
that commonly come before regulatory agencies, such as medical drug efficacy 
comparisons (Holden 2003). This means that scientists and engineers are often 
performing incomparable assessments (Ernst & Resch 1996). Risk assessments 
clearly have value in so far as they focus attention on the public impact of 
research. However, it is not obvious that merely asking scientists to consider 
the risks and benefits of their work will result in due consideration and com- 
munication of risks in ways that satisfy policymakers and the general public 
(Fischhoff 1995). Risk-benefit analysis is often recommended as a policy mech- 
anism for mitigating technological risk, but it is still unclear what its practical 
value is to policy formulation. This leads to the fundamental question—how do 
we assess the risks and benefits of potentially dangerous science? 


Assessing the Benefits of Research 


To answer the question of how we assess the risks and benefits of dangerous sci- 
ence, it helps to break down the problem. We will start with the assessment of 
benefits—a topic frequently revisited during budgetary debates over government 
funding of research. Trying to assess the benefits of research is a long-standing 
and contentious activity among science policy analysts and economists. 
Government-funded research constitutes less than one third of total research 
spending in the US, but public funding of research does not merely augment 
or even displace private investment (Czarnitzki & Lopes-Bento 2013). Rather, 
public funding is critical to early-stage, high-risk research that the private sector 
is unwilling to fund. The result is a disproportionate contribution of government 
funding to technological innovation (Mazzucato 2011). For example, while the 
private sector funds nearly all the pharmaceutical clinical trials in the US, public 
funding is still the largest source of pharmaceutical basic research. 


Methods of Valuation


So how do policymakers assess the benefits of publicly funded research? Let’s
look at some of the most common approaches.


Research as a jobs program


In a simple input-output model of research, spending on salaries, equipment,
and facilities associated with research has an analogous output of research jobs,
manufacturing jobs, construction jobs, and so forth, but the impact of the
actual research output is neglected (Lane 2009). While this is a simplistic way to
view research, it is popular for two reasons. First, it is relatively quantifiable and
predictable compared to methods that focus on research output. For example, 
the STAR METRICS¹ program was started in 2009 to replace anecdotes with
data that could be analyzed to inform the ‘science of science policy’ (Largent 
& Lane 2012). However, the first phase of STAR METRICS only attempted to 
measure job creation from federal spending (Lane & Bertuzzi 2011; Weinberg 
et al. 2014). A subsequent program, UMETRICS, designed to measure uni- 
versity research effects, used a similar approach by analyzing the same STAR 
METRICS data to determine the job placement and earnings of doctorate 
recipients (Zolas et al. 2015). 

The second reason for the jobs-only approach is that job creation and
retention are a primary focus of government policymakers. Elected officials may
talk about the long-term implications of research spending, but the short-term 
impacts on jobs for their constituents are far more relevant to their bids for 
re-election. As US Representative George E. Brown Jr. noted, the unofficial 
science and technology funding policy of Congress is ‘Anything close to my 
district or state is better than something farther away’ (Brown 1999). 

One outcome of viewing research in terms of immediate job creation is that
any research program may be seen as a benefit to society because all research
creates jobs. However, this ignores that some research, through automation 
development or productivity improvements, will eventually eliminate jobs. 
Likewise, when research funding is focused on the desire to retain science and 
engineering jobs in a particular electoral district, it can diminish the perceived 
legitimacy of a research program. For example, there is a long-standing cynical 
perception of some US National Aeronautics and Space Administration (NASA) 
funding acting as a southern states jobs program (Berger 2013; Clark 2013). 


Econometric valuation 


The jobs-only perspective is obviously narrow, so most serious attempts at 
measuring the benefits of research use broader economic indicators. Econo- 
metric methods have attempted to measure the value of research by either a 
microeconomic or macroeconomic approach. 

The microeconomic approach attempts to estimate the direct and indirect 
benefits of a particular innovation, often by historical case study. The case study 
approach offers a depth of insight about particular technologies that is often 
underappreciated (Flyvbjerg 2006). However, it is time and resource inten-
sive, and its detailed qualitative nature does not lend itself to decontextualized
quantification.² Furthermore, actual benefit-cost ratios or rates of return for
case studies tend to be valid only for the industry and the time period studied.
As a result, they can be a poor source for forming generalizations about research
activities. Additionally, innovation often comes from chance discovery (Ban
2006), which further complicates attempts to directly correlate specific research
to economic productivity.


¹ Science and Technology for America’s Reinvestment - Measuring the EffecTs of
Research on Innovation, Competitiveness, and Science (a bit of a stretch for an
acronym)

The macroeconomic approach attempts to relate past research investments to 
an economic indicator, such as gross domestic product (GDP).³ This approach is
more useful for evaluating a broader range of research activities. Using the mac- 
roeconomic approach, the value of research is the total output or productivity of 
an organization or economy based on past research investments. Three impor- 
tant factors have been noted when attempting a macroeconomic valuation of 
research (Griliches 1979): 


1. The time lag between when research is conducted and when its results are 
used defines the timeframe of the analysis. Depending on the research, 
the time lag from investment to implementation may take years or 
decades. 

2. The rate at which research becomes obsolete as it is replaced by 
newer technology and processes should be considered. The knowledge 
depreciation rate should be higher for a rapidly changing technology than 
for basic science research. For example, expertise in vacuum tubes became 
substantially less valuable after the invention of the transistor. Conversely, 
the value of a mathematical method commonly used in computer science 
might increase over time. 

3. There is a spillover effect: the amount of similar research being conducted
by competing organizations affects the value of an organization’s own
research. This effect might be
small for unique research that is unlikely to be used elsewhere. Further 
complicating this effect is the influence of an organization’s ‘absorptive 
capacity’ or ability to make use of research output that was developed 
elsewhere (Cohen & Levinthal 1989). Even without performing substan- 
tial research on its own, by keeping at least a minimum level of research 
capability, an organization can reap the benefits of the publicly available 
research output in its field. 
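
The first two factors above, time lag and knowledge depreciation, are often combined in a perpetual-inventory style calculation in which a stock of usable knowledge grows with lagged research spending and shrinks as older results become obsolete. The following is a minimal Python sketch of that idea under stated assumptions: the function name, the flat spending series, the two-year lag, and both depreciation rates are hypothetical illustrations and are not values taken from Griliches (1979).

```python
def knowledge_stock(spending, lag, depreciation):
    """Accumulate a hypothetical knowledge stock from annual research spending.

    spending     -- list of annual research outlays (any consistent unit)
    lag          -- years before spending starts contributing to the stock
    depreciation -- fraction of the existing stock that becomes obsolete each year
    """
    stock = 0.0
    history = []
    for year in range(len(spending)):
        # Spending only becomes usable once the lag has elapsed.
        usable = spending[year - lag] if year >= lag else 0.0
        # Last year's stock is depreciated, then the newly usable research is added.
        stock = (1.0 - depreciation) * stock + usable
        history.append(stock)
    return history


# Assumed example: constant spending of 100 per year for 10 years.
flat = [100.0] * 10
fast_obsolescence = knowledge_stock(flat, lag=2, depreciation=0.30)
slow_obsolescence = knowledge_stock(flat, lag=2, depreciation=0.05)
print(fast_obsolescence[-1], slow_obsolescence[-1])
```

Even with identical spending, the two assumed depreciation rates produce very different final stocks, which is one reason estimates of research value are so sensitive to these analytical choices.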


² This has not stopped big-data enthusiasts from trying. For example, keyword
text-mining was performed on a 7,000 case study audit of research impacts in 
the United Kingdom. Ironically, the point of the audit was to add complementary 
context to a quantitative assessment (Van Noorden 2015). 

³ While GDP is a popular economic indicator, detractors dislike its simplicity,
which ignores many factors, including inequality, quality-of-life, happiness, and 
environmental health (Masood 2016; Graham, Laffan & Pinto 2018).


In general, quantifying any of the above factors is easier for applied research
than for basic research. Likewise, it is easier to quantify private benefit to a 
particular organization than public benefit. Another factor that prevents easy 
identification of the economic value of research is the general lack of data vari- 
ability. Research funding rarely changes abruptly over time, so it is difficult to 
measure the lag between research investments and results (Lach & Schanker- 
man 1989). 

The most common approach for determining the economic rate of return 
for research is growth accounting, where research is assumed to produce all
economic growth not accounted for by other inputs, such as labor and capi- 
tal. Economists often refer to this unaccounted growth as the Solow residual 
(Solow 1957). A comprehensive review (Hall, Mairesse & Mohnen 2009) of 
147 prior research studies that used either an individual business, an industry, 
a region, or a country to estimate the rate of return of research found a variety 
of results. The majority of studies found rates of return ranging from 0 to 50 
percent, but a dozen studies showed rates over 100 percent,⁴ a wide interval
that portrays the difficulty of quantifying the benefits of research. 
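
The growth accounting idea described above can be made concrete with a short calculation. The following Python sketch assumes the common textbook form in which output growth is split between capital and labor using constant factor shares that sum to one; that functional form, the function name, and every number in the example are assumptions for illustration and are not drawn from the studies reviewed here.

```python
def solow_residual(output_growth, capital_growth, labor_growth, capital_share):
    """Return the share of output growth not explained by capital and labor.

    All growth rates are annual fractions (0.03 means 3 percent).
    capital_share is the assumed fraction of income paid to capital;
    labor is assumed to receive the remainder.
    """
    labor_share = 1.0 - capital_share
    explained = capital_share * capital_growth + labor_share * labor_growth
    return output_growth - explained


# Hypothetical economy: 3% output growth, 4% capital growth, 1% labor growth,
# and an assumed capital share of 0.3.
residual = solow_residual(0.03, 0.04, 0.01, 0.3)
print(f"Unexplained growth: {residual:.3%}")
```

In this invented case, capital and labor account for 1.9 percentage points of growth, leaving about 1.1 points as the residual. Attributing all of that residual to research is the strong assumption that makes this approach convenient and also contestable.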

Not surprisingly, the return on research is not constant across fields, coun- 
tries, or time, so any estimates from one study should be used cautiously 
elsewhere. Likewise, it is important to distinguish general technological pro- 
gress from research. While technological progress may account for most of 
the Solow residual, a non-negligible amount of innovation occurs outside of 
funded research programs (Kranzberg 1967, 1968). Ultimately, due to the many 
potential confounding factors, such as broader economic conditions or politi- 
cal decisions, it is difficult to show a causal relationship for any correlation of 
productivity or profit with research. 

Unfortunately, some advocacy groups have issued reports that imply a simple 
direct relationship between scientific investment and economic growth. Such 
statements are unsupported by historical data (Lane 2009). For example, Japan 
spends a higher proportion of GDP on research than most countries but has 
not experienced the expected commensurate economic growth for the past two 
decades. Likewise, at the beginning of this century, research spending in the US 
was about 10 times higher than in China. As a result, US contribution to the 
global scientific literature was also about 10 times higher than China’s. How-
ever, in the following decade, China’s economy expanded 10 times faster than
the US economy.⁵ The exact relationship between research spending and eco-
nomic growth remains unclear; the only consensus is that research is beneficial.


⁴ One outlier study conservatively (or humbly) estimated rates between -606 and 734
percent. 

⁵ Ironically, China’s robust economy allowed it to dramatically increase its research
spending and eventually surpass the US in total number of science publications
according to NSF statistics.


Valuation by knowledge output


Given the difficulties of using econometric methods to assess research, econo- 
mists have explored other methods that avoid monetizing research benefits 
and the private versus social benefit distinction. One popular alternative 
is to use academic publications. Despite their relative simplicity compared to
economic growth measures, publication counts are still problematic. First, comparisons are
complicated because scientific publication is not equally valued among all 
fields and organizations. Publication in prestigious journals is often essential 
to career advancement in academia but is relatively unimportant for indus- 
trial research scientists. Likewise, an ever-increasing proportion of research 
is being disseminated outside standard academic journals via the internet— 
open access pre-print archives, research data repositories, and code-sharing 
sites have all become common. It is unclear how these new modes of infor- 
mation sharing should be measured. Second, the method does not assess the 
relative value or visibility of an individual publication. This issue is partially 
addressed by using the number of citations rather than the number of publi- 
cations. However, citations are not a clear sign of quality research. For exam- 
ple, citations are commonly made to provide basic background information 
(Werner 2015). Similarly, a journal’s impact factor, the average number of
citations received in a year by the articles a journal published in the preceding
two years, is widely derided as a proxy for research output quality (yet the
practice is still shamefully common). Conversely, a lack of citations does not
necessarily indicate a lack of social benefit.
The information in an academic research article may be widely used in non- 
academic publications—reports, maps, websites, and so forth—without ever 
generating a citation that can be easily found. Likewise, online papers that 
have been repeatedly viewed and downloaded, but never cited, may have more 
public value than a cited paper with less internet traffic. Additional shortcom- 
ings are similar to traditional econometric approaches: what is the appropri- 
ate lag time for publications and what time window should be considered for 
counting publications based on the depreciation rate of scientific knowledge 
(Adams & Sveikauskas 1993)? 

Scientists love to create models for complex problems, so it should be no sur- 
prise that a model was created that estimates the ultimate number of citations 
for a particular article (Wang, Song & Barabasi 2013). The model included the 
following characteristics: citations accrue faster to papers that already have 
many citations, a log-normal decay rate for future citations, and a general factor 
that accounts for the novelty and importance of a paper. However, the model 
required 5 to 10 years of citation history to make projections, and the difficulty 
of properly calibrating the model limited its utility (Wang et al. 2014; Wang, 
Mei & Hicks 2014). Conversely, an extensive study that looked at 22 million 
papers published over the timespan of a century in the natural and social sci- 
ences found that citation histories are mixed and unpredictable (Ke et al. 2015). 
Some extreme papers, labeled ‘sleeping beauties,’ accumulated few citations for
decades and then suddenly peaked, presumably because an important appli-
cation for the research occurred at a much later date. Likewise, some of the
most novel papers tend to languish for years in less prestigious journals but 
are later recognized by other fields for their original contributions and
eventually become highly cited (Wang, Veugelers & Stephan 2016). Generally 
speaking, using short-term citations as a metric for assessing research is a 
bad idea. 
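
To make the three ingredients of the citation model concrete, the following Python sketch uses the closed-form cumulative citation curve usually attributed to that model, in which a paper-specific fitness term is multiplied by a log-normal aging curve inside an exponential. The exact functional form as written here, the constant m, and all parameter values are assumptions for illustration; none were fitted to real citation data.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cumulative_citations(t, fitness, mu, sigma, m=30.0):
    """Projected citations t years after publication (assumed functional form).

    fitness -- relative novelty and importance of the paper
    mu      -- sets how long the paper takes to reach its citation peak
    sigma   -- sets how slowly attention to the paper decays
    m       -- global scaling constant (the value 30 is an assumption here)
    """
    if t <= 0:
        return 0.0
    aging = normal_cdf((math.log(t) - mu) / sigma)  # log-normal aging curve
    return m * (math.exp(fitness * aging) - 1.0)


def ultimate_citations(fitness, m=30.0):
    """Long-run citation total, which depends only on fitness in this form."""
    return m * (math.exp(fitness) - 1.0)


# Hypothetical paper: compare a 5-year projection with its long-run total.
print(cumulative_citations(5, fitness=2.0, mu=1.5, sigma=1.0))
print(ultimate_citations(fitness=2.0))
```

The sketch also illustrates the calibration problem noted above: the long-run total depends entirely on the fitness parameter, which can only be estimated from several years of citation history, and a sleeping beauty would be badly mispredicted by any early fit.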

A similar non-monetary approach for measuring research benefits is to 
count the number of patent citations in a particular field (Griliches 1979; Jaffe, 
Trajtenberg & Henderson 1993; Ahmadpoor & Jones 2017). This method has 
the benefit of better assessing the practical value of research activities and cap- 
turing the technological innovation component of research that is likely to have 
high social benefit. However, this method also shares some of the drawbacks 
of the publication approach as well as a few unique drawbacks of its own. The 
economist Zvi Griliches observed that a US productivity peak in the late 1960s 
was followed by a decline in patents granted in the early 1970s and that both 
events were preceded by a decline in the proportion of GDP devoted to indus-
trial research spending in the mid-1960s. Whether productivity and patents
followed a 5- to 10-year lag behind research spending was difficult to determine
because, among other factors, the number of patents per research dollar also
declined during that period, an energy crisis occurred, and other countries
suffered similar productivity losses without the drop in research funding
(Griliches 1994).

Fluctuations in patent generation may also be due to the national patent 
office itself. For example, the 2011 Leahy-Smith America Invents Act, which 
took effect in 2013, changed the unique US first-to-invent patent system to a
more standard first-to-file system. This makes comparisons before and after the
new system more difficult. Likewise, as Griliches noted, stagnant or declining 
funding for a patent office could limit the throughput of the department or pre- 
vent it from keeping up with growing patent application submissions. This very 
phenomenon appears to have occurred in the US since the innovation boom of 
the Internet age (Wyatt 2011). 

Ultimately, patents remain a limited and non-representative measure of 
research benefits. There is poor correlation between patents and public benefit 
because most benefits come from a small subset of all patents, only about
half of all patents are ever used, and fewer still are ever renewed (Scotchmer 2004).
Also, not all organizations patent their inventions at the same rate because the 
value of a patent is distinct from the value of the invention (Bessen 2008). Phar-
maceutical patents can be extremely valuable, whereas software-related inno-
vations are more difficult to defensibly patent and are often obsolete before the
patent is even awarded. A review of the 100 most important inventions each year 
from 1977 to 2004, as judged by the journal Research and Development (‘R&D
100 Awards’), found that only one-tenth were actually patented (Fontana et al.
2013). Most companies relied on trade secrets or first-to-market advantages
rather than patents. Patents allow a holder to litigate against infringement, but
this legal right is often too expensive and time-consuming for all but the largest 
organizations to carry out. Alternatively, a large collection of related patents 
can create a ‘patent thicket’ whose primary value is rent-seeking and slow-
ing competitors, not social benefit. A CDC list of the most important public
health achievements of the 20th century contained no patented innovations 
(Boldrin & Levine 2008), suggesting patents are indeed a very poor measure of 
research social benefit. Nonetheless, patents are still widely used as a measure 
of research value for lack of a convincing alternative. 

While economists view the measurement of knowledge output to be prob- 
lematic but possible, others believe the problem is intractable or at least not 
quantifiable in any honest way. Philosopher Paul Feyerabend argued that a 
careful study of the history of science shows the truth or usefulness of any par- 
ticular scientific theory or line of research may not be appreciated for decades 
or even centuries (Feyerabend 2011). He gave one extreme example of the the- 
ory proposed by the Greek philosopher Parmenides of Elea (5th century BCE) 
that all matter has the same fundamental nature. The theory was abandoned for 
over 2,000 years before being revived by particle physicists in the 20th century. 
A more recent example is the theory of continental drift, first proposed in 1596 
by Flemish cartographer Abraham Ortelius. The theory was revived in 1912 
by meteorologist Alfred Wegener, who unsuccessfully championed the idea 
for two decades.⁶ After the steady accumulation of supporting evidence, the
idea was eventually incorporated into the theory of plate tectonics in the 1960s, 
which now serves as a cornerstone of modern geoscience. Perhaps the most 
relevant example is the theory that fossil fuels cause global warming, which 
was first proposed by Swedish scientist Svante Arrhenius in 1896. His work 
was inspired by British scientist John Tyndall’s 1859 work on the radiative heat 
absorption properties of carbon dioxide and water vapor and their likely effects 
on the planet’s surface temperature. Despite winning the 1903 Nobel Prize in 
Chemistry for foundational work in electrochemistry, Arrhenius’ work detail- 
ing the correct mechanism and mathematical relationship between infrared 
absorption and atmospheric carbon dioxide concentrations was largely ignored 
for almost a century before anthropogenic climate change was realized to be an 
unprecedented threat to humanity. 

Even though economists are generally trying to measure the short-term 
societal benefits of more tangible and immediate research, selecting a lag time 
is merely a choice of analytical convenience. There were decades between the 
development of quantum physics and technologies based on quantum theory: 
transistors, lasers, magnetic resonance imaging, and so on. The theory is over a 
century old and yet new technologies, such as quantum computers, are still in
development. It would be hard to argue that these were impractical or unim-
portant benefits that could be left out of a realistic benefits assessment. It would
seem even a field of research that has yet to yield useful results, such as string
theory (Castelvecchi 2015), should not be dismissed as long as it still has intel-
lectual inspirational value; one never knows what is yet to transpire. Likewise,
how does one measure the benefits of long-term research that may require dec-
ades to yield significant findings (Owens 2013b)?


⁶ His failure was partly scientific (his observations had no good explanatory
mechanism) and partly social (he was an outsider to the geology community and a
German World War I veteran).

Selecting a lag time by a cutoff function that is designed to capture most of the
citations, patents, or economic growth attributable to past research rests on the
questionable assumption that only the intended outcome of applied research is
of interest. However, the history of technology suggests secondary unintended 
discoveries, both good and bad, are important. For example, in the pharmaceu- 
tical industry, drugs are commonly repurposed when they are unexpectedly 
found to treat a disease other than their intended target. Thus, selecting a time 
period for the evaluation of research may capture some of the intended out- 
comes but miss the secondary serendipitous discoveries (Yaqub 2018). 


Valuation by multiple metrics 


The various metrics discussed so far appear to be poor measures of the social 
benefits of research. They are popular primarily because they make use of the 
available data, not because they necessarily measure the desired outcomes. 
Metrics are frequently pursued with the noble intention of improving account- 
ability and transparency but do not often accomplish either because they tend 
to oversimplify complex processes and create perverse incentives to game the 
system when metrics are used to reward or punish individuals.⁷

For example, if patents become a preferred metric of research productiv- 
ity, some researchers will knowingly generate patents that are of questionable 
licensing value to improve their likelihood of securing future funding. Like- 
wise, the frequent practice of using the number of publications as a metric has 
led to academic complaints about ‘salami-slicing’ research and jokes about 
the ‘least publishable unit.’ Quantitative assessments of research output in the
United Kingdom, Australia, and New Zealand may have created the unin-
tended consequence of pushing researchers away from high-risk basic research
and toward more conventional, short-term, applied projects to improve their
rankings (Owens 2013a; McGilvray 2014). History suggests abandoning basic
research in favor of seemingly more predictable short-term applied research is
probably counterproductive. For example, how could one predict that the germ
theory of disease developed in the 19th century would be the impetus for the
modern sanitation techniques responsible for much of the increase in average
life expectancy in the 20th century? A review of almost 30 years of biomedi-
cal research grants found that basic and applied research were equally likely
to be cited in patents (Li, Azoulay & Sampat 2017). Of course, the underlying
observation is not new. Abraham Flexner first made the argument that basic
research yields important social benefits in his 1939 essay, The Usefulness of
Useless Knowledge. It appears the message requires frequent repetition.


⁷ The metrics-focused system analysis approach of Secretary of Defense Robert
McNamara is often blamed for the tragically poor decision-making surrounding the
war in Vietnam as well as his later missteps as president of the World Bank. Propo-
nents of metrics often point to successes in far simpler and more quantifiable human
endeavors, such as baseball (Muller 2018).

Despite these critiques, there has been some hope that using a family of 
complementary metrics would yield an improved estimate over individual 
research measurements. For example, a combination of publication citations to 
capture basic research and patents to capture technology development might 
appear to be a complementary set of measurements. The STAR METRICS 
program was created to measure the impact of US federally funded research 
using a multi-dimensional approach. Some of the proposed indicators included 
(Federal Demonstration Partnership 2013): 


• number of patents;
• number of start-up companies;
• economic value of start-up companies over time;
• future employment of student researchers;
• impacts on industry from research;
• number of researchers employed;
• publications and citations; and
• long-term health and environmental impacts.


While the STAR METRICS approach avoided some of the limitations of indi- 
vidual metrics previously discussed, it was questionable how many of the pro- 
posed metrics could be measured in practice or how representative the final set 
of metrics would be. Given the difficulty of the task, it was not surprising when 
the full implementation of the STAR METRICS program was abandoned in 2015.

A more successful program, the Innovation Union Scoreboard, has been 
evaluating the research efforts of European Union member states since 2007 
(Hollanders & Es-Sadki 2014). It encompasses 25 metrics, including multiple 
indicators for educational outcomes, scientific publications, patents, public 
research investments, private research investments, employment, and other 
economic indicators. As with similar programs, the Innovation Union Score- 
board is by necessity restricted to indicators for which there are data. As such, 
unquantifiable benefits are missed. 

Despite the difficulties of quantitatively valuing research, the era of big data
has inspired an entire alphabet soup of research assessment systems, none of 
which can be easily compared to each other. Detractors have argued that these 
broad quantitative measurement tools are just as non-representative and easily
gamed as the many popular, but widely derided, college ranking schemes. It
has yet to be seen if any of these multi-metric systems will improve research or 
how—outside their own definition—success will be determined. However, the 
rush to quantitative assessment is not universal. The Chinese Academy of Sci-
ences moved away from an existing 24-indicator multi-metric research ranking
system to a qualitative system based on peer review (Kun 2015). The motiva- 
tion was a desire to place emphasis on the real social value of research rather 
than on easily measured surrogates. 


Value-of-information analysis 


Econometric methods would appear to be the obvious choice for perform- 
ing a research cost-benefit analysis. However, as previously discussed, this is 
a difficult task even for research that has already been conducted. Estimating 
the value of future research is even more uncertain as it requires the question- 
able assumption that the future will be much like the past. This is a difficult 
assumption to defend because history shows the progress of technology to 
be inconsistent and unpredictable. Computer technology has exceeded most 
predictions made in the 20th century, yet utilities powered by nuclear fusion 
have stubbornly remained a technology of the future. Unfortunately, there is no 
consistent set of criteria that will predict whether a particular research project 
will succeed. The list of contributing factors is extensive, and there is even disa- 
greement among studies regarding the magnitude and direction of influence of 
each factor (Balachandra & Friar 1997). 

For future research decisions, an alternative to traditional econometric or 
knowledge output approaches is to use value-of-information (VOI) analysis. In 
VOI, the value of the research is measured by estimating its expected value to a 
particular decision and weighing it against the cost of obtaining that information
(Morgan, Henrion & Small 1990; Fischhoff 2000).* For example, knowing
the transmissibility of a particular pathogen has value for public health officials 
in their decision of how to prepare for future pandemics. This value can be 
measured in any agreeable units—money, lives saved, response time, and so 
on. The primary strength of this approach is that it deals directly with the value 
of research to the decision maker (Claxton & Sculpher 2006). By comparison, 
high-quality research, as measured by knowledge output methods, has no clear
correlation to societal benefit, only an assumed link. Because VOI is a forward-looking
predictive method of valuation rather than a backward-looking
reflective method, it sidesteps the issue of making comparisons between past
and future research.

* VOI literature often uses the term ‘expected value of perfect information’, which is
simply the difference between the value of the decision made with complete information
and the value of the decision made with existing information. Restated, this is the
value of removing uncertainty from the decision process.
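
The expected value of perfect information described above can be worked through numerically. The following Python sketch is illustrative only: the two transmissibility states, the two preparedness actions, the probabilities, the payoffs (in arbitrary units such as lives saved), and the research costs are all invented assumptions, not figures from any study.

```python
# Hypothetical VOI sketch: every number below is an invented assumption.

# Prior belief about the uncertain quantity (pathogen transmissibility).
PRIOR = {"low": 0.7, "high": 0.3}

# Payoff of each preparedness action in each state, in arbitrary units
# (for example, lives saved). Action names are illustrative only.
PAYOFF = {
    "basic_plan": {"low": 100, "high": 200},
    "stockpile_now": {"low": 60, "high": 500},
}


def expected_payoff(action, belief):
    """Expected payoff of one action under a belief over states."""
    return sum(prob * PAYOFF[action][state] for state, prob in belief.items())


def value_with_existing_information(belief):
    """Best expected payoff when one action is chosen before the state is known."""
    return max(expected_payoff(action, belief) for action in PAYOFF)


def value_with_perfect_information(belief):
    """Expected payoff when the best action is chosen after learning the state."""
    return sum(
        prob * max(PAYOFF[action][state] for action in PAYOFF)
        for state, prob in belief.items()
    )


def expected_value_of_perfect_information(belief):
    """Difference between deciding with complete and with existing information."""
    return value_with_perfect_information(belief) - value_with_existing_information(belief)


def research_is_worthwhile(belief, research_cost):
    """VOI rule: fund the research only if its expected value exceeds its cost.

    The cost must be expressed in the same units as the payoffs.
    """
    return expected_value_of_perfect_information(belief) > research_cost
```

With these invented numbers, the expected value of perfect information is about 28 units, so research costing 10 units would pass the test and research costing 50 units would not. Because real research rarely removes all uncertainty, this figure is best read as an upper bound on what the information could be worth.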

Another strength is that VOI analysis is a more theoretically complete and 
consistent method of research valuation. Performing a cost-benefit analysis 
using a family of economic, knowledge, and social metrics can use collected 
data, but that data will generally be an incomplete measure of the total value of 
research and will often consist of proxies for the characteristics we would prefer 
to measure. Conversely, a VOI approach can place a direct value on factors that
are difficult to monetize, such as the aesthetic, intellectual, or even cultural significance
of a scientific discovery. Thus, VOI is complete in the sense that any recognized
benefit can be included in the analysis. 

However, the thoroughness of the VOI approach comes at the price of sub- 
jective estimates and value judgments. VOI is a productive decision tool only 
when one can reasonably estimate the value of obtaining the information. For 
that reason, VOI is often applied to business, engineering, and applied science 
decisions (Keisler et al. 2013). For example, VOI would be useful for estimating 
whether a particular medical test has value for decisions about patient treat- 
ment. However, it is harder to use VOI for estimating the value of highly uncer- 
tain basic research. VOI is subjective when it measures subjective things. It 
cannot create certainty out of uncertainty. 

The thoroughness of the VOI approach also complicates analysis due to 
the broader array of potential social benefits that might be considered. While 
VOI is simple in concept, it can be quite complex in practice. For this reason, 
the VOI approach is often used in conjunction with an influence diagram—a 
visual representation of a decision process that represents variables as nodes 
and interactions between variables as arrows (Howard & Matheson 2005). The 
influence diagram serves as a visual aid to elucidate and organize the often 
complex interaction of factors that can affect the value of basic research. How- 
ever, an influence diagram with more than a dozen or so nodes and arrows 
tends to become an unreadable labyrinth that provides little insight. 

As an example, Figure 1 shows the relation among the various ways in which 
research can be valued as an influence diagram. Each form of valuation is 
represented as a node with arrows indicating whether the method informs another
method. For example, job creation is often concurrent with economic growth 
(but not always), so we would expect these two research valuation methods to 
be closely related. Likewise, both jobs and economic growth can be used in a 
multi-metric approach or in an expert opinion approach. Knowledge output, in 
the form of citations and patents, can also be used in a multi-metric approach 
and is similar to the VOI approach in that both are non-monetary and can 
more easily characterize the value of basic research with no immediate prac- 
tical applications. Expert opinion, discussed in the next section, is the most 
comprehensive approach in that it can make use of all the other methods of 
valuation. However, in practice, expert opinion can range from superficial to 
comprehensive.

Figure 1: Methods of valuing research.
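
The node-and-arrow structure of an influence diagram maps naturally onto a directed graph. The sketch below encodes the relationships described in the text as a Python dictionary. The edges are read from the prose rather than copied from the figure, so the node names, the arrow directions (for instance, job creation pointing to economic growth), and the readability threshold are assumptions made for illustration.

```python
# Each key is a valuation method; its list holds the methods it informs.
# Edges are inferred from the prose description, not taken from the figure.
INFLUENCES = {
    "job_creation": ["economic_growth", "multi_metric", "expert_opinion"],
    "economic_growth": ["multi_metric", "expert_opinion"],
    "knowledge_output": ["multi_metric", "expert_opinion"],
    "value_of_information": ["expert_opinion"],
    "multi_metric": ["expert_opinion"],
    "expert_opinion": [],
}


def informed_by(node):
    """Return the methods that have an arrow pointing into `node`."""
    return sorted(src for src, targets in INFLUENCES.items() if node in targets)


def arrow_count():
    """Total number of arrows in the diagram."""
    return sum(len(targets) for targets in INFLUENCES.values())


def is_readable(max_elements=12):
    """One reading of the 'dozen or so nodes and arrows' rule of thumb."""
    return len(INFLUENCES) <= max_elements and arrow_count() <= max_elements
```

Calling `informed_by("expert_opinion")` returns every other method, which matches the description of expert opinion as the approach that can draw on all the others.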


Although less formal than the VOI approach, a similar process can be used 
to reconcile the supply and demand for science research (Sarewitz & Pielke Jr 
2007). This is done by collecting the information required by policymakers (the 
demand) through workshops, surveys, interviews, and committees. Using the 
same process, regular assessments are made regarding whether the research 
(the supply) is actually being used. Rather than placing a common value on 
the information, the intent is only to re-align research priorities to maximize 
social benefit. In theory, this is a great idea because useful science can happen 
by accident, but more useful science will happen when it is done with purpose. 
The priority re-alignment process is much less subjective than VOI in the sense 
that it does not attempt to quantitatively compare research programs. However, 
it is also time-consuming in that it seeks input from all stakeholder groups and
can be difficult to complete when contentious issues preclude a consensus on 
the demand for science. Furthermore, it is difficult to predict what research 
will actually yield the most social benefit; focusing only on immediate applied
research would miss basic research that eventually yields important
technology.

In standard VOI literature, benefit is derived from additional knowledge, 
and it is assumed that the value of information can never be negative because 
a decision maker can always choose to ignore low-value information (Black- 
well 1951). However, experiments suggest decision makers are often unable to 
ignore unhelpful information once it is known due to a ‘curse of knowledge’ 
(Camerer, Loewenstein & Weber 1989).* Furthermore, decision makers are
often unaware when information is unhelpful, as shown by their surprising willingness
to pay for unhelpful information (Loewenstein, Moore & Weber 2003).
This calls into question the basic assumption that the value of information is never
negative because it can be ignored without cost.
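
The contrast between the two kinds of decision maker can be sketched with a small calculation. In the hypothetical Python example below, a signal that is pure noise has zero value for a decision maker who is free to ignore it, but negative value for one who always acts on it. The states, actions, payoffs, probabilities, and the 'always follow the signal' rule are invented assumptions used only to illustrate the argument.

```python
# Hypothetical sketch: all probabilities and payoffs are invented.
P_GOOD = 0.8  # prior probability that the true state is "good"
PAYOFF = {
    "proceed": {"good": 10, "bad": -5},
    "halt": {"good": 0, "bad": 0},
}
NOISE = {"good": 0.5, "bad": 0.5}  # signal unrelated to the true state


def expected_payoff(action, p_good):
    """Expected payoff of an action given the probability of the good state."""
    return p_good * PAYOFF[action]["good"] + (1 - p_good) * PAYOFF[action]["bad"]


def best_expected_payoff(p_good):
    """Payoff of the best available action under the current belief."""
    return max(expected_payoff(action, p_good) for action in PAYOFF)


def signal_cases(p_good, p_says_good):
    """Yield (signal, P(signal), posterior P(good | signal)) via Bayes' rule.

    p_says_good[state] is the chance the signal reads "good" in that state.
    """
    for signal in ("good", "bad"):
        like_good = p_says_good["good"] if signal == "good" else 1 - p_says_good["good"]
        like_bad = p_says_good["bad"] if signal == "good" else 1 - p_says_good["bad"]
        p_signal = p_good * like_good + (1 - p_good) * like_bad
        if p_signal > 0:
            yield signal, p_signal, p_good * like_good / p_signal


def value_if_free_to_ignore(p_good, p_says_good):
    """Decision maker who picks the best action for each updated belief."""
    cases = signal_cases(p_good, p_says_good)
    with_signal = sum(p * best_expected_payoff(post) for _, p, post in cases)
    return with_signal - best_expected_payoff(p_good)


def value_if_unable_to_ignore(p_good, p_says_good):
    """Decision maker who always acts on the signal, whatever its quality."""
    follow = {"good": "proceed", "bad": "halt"}
    cases = signal_cases(p_good, p_says_good)
    with_signal = sum(p * expected_payoff(follow[s], post) for s, p, post in cases)
    return with_signal - best_expected_payoff(p_good)
```

Under these assumptions, `value_if_free_to_ignore(P_GOOD, NOISE)` is zero, while `value_if_unable_to_ignore(P_GOOD, NOISE)` is negative, and any price paid for the signal would lower both values further.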

We can extend this concept of negative value of information to include 
research that may yield knowledge with the potential for public harm, such as dual-use
research that has obvious uses for militaries, terrorists, or criminals. Without
the negative VOI concept, research cannot be any worse than wasted effort. 
With the idea of negative VOI, some research programs may yield informa- 
tion we might prefer not to know or find morally objectionable (Kass 2009). 
Likewise, some research might harm the public because it is erroneous. For 
example, a 1998 Lancet paper linked the MMR vaccine with autism. Although 
later discredited and retracted, it fueled suspicion regarding the safety of child- 
hood vaccination; subsequent outbreaks of preventable diseases and multiple 
fatalities occurred in communities that disproportionately avoided vaccination 
(Gross 2009). 


Qualitative assessment by expert opinion 


A 1986 US Office of Technology Assessment report reviewed a variety of quan- 
titative methods for determining the value of research and the prevalence of 
such methods in industry and government. The report found that the majority 
of managers preferred ‘the judgment of mature, experienced managers’ as the 
best method for assessing the value of research (OTA 1986). Formal quanti- 
tative models were perceived to be misleading due to their simplistic nature, 
which missed the complexity and uncertainty inherent in the decision-making 
process. 

Given the issues with various quantitative methods as previously described, 
it is not surprising that expert opinion is still the gold standard in estimating 
the value of research. However, qualitative expert review is also problematic. 
Two fundamental difficulties with using expert opinion are conflicts of inter- 
est and unavoidable bias. Specialists are usually employed within their field 
of expertise, which leads to a weak, but pervasive, financial conflict of inter- 
est. Likewise, people tend to attach the most value to activities on which they 
have spent the most time. This phenomenon, referred to as effort justification
(Festinger 1957) or the IKEA effect (Norton, Mochon & Ariely 2012), so named because
people tend to value an object more when they assemble it themselves, can
lead experts to unintentionally overestimate the value of the research with
which they have been most involved. Even the appearance of conflict between
what is in the best interest of the general public and what is in the best interest
of the experts themselves decreases credibility and can make research assessment
discussions look like special interest lobbying.

* An example of the curse of knowledge occurs in teaching. It is extremely difficult
to imagine one’s own state of mind before a concept was understood. This leads to
teachers often overestimating the clarity of their instruction and the comprehension
of their students (Wieman 2007).

One way to partially compensate for potential expert bias is to actively seek 
competing views. Philosopher Philip Kitcher recommends an ‘enlightened
democracy’ where well-informed individuals selected to broadly represent
society set science research agendas (Kitcher 2001). This ideal is set as a middle
ground between a ‘vulgar democracy’, where science suffers from the ‘tyranny
of the ignorant’, and the existing system, where a struggle for control over the
research agenda is waged between scientists (internal elitism) and a privileged 
group of research funders (external elitism). Some influences on the science 
research agenda, such as focused lobbying by well-informed advocates, defy 
this idealized distinction between a scientific elite and an uninformed pub- 
lic. Nonetheless, the struggle to maintain a balanced and representative set of 
research policymakers is real. 

One example of this struggle was the President’s Science Advisory Com- 
mittee created by US President Eisenhower to provide cautious science policy 
analysis during the American pro-science panic that occurred after the launch 
of Sputnik in October 1957. The Committee’s criticism of President Kennedy’s 
manned space program and President Johnson’s and President Nixon’s mili- 
tary programs led to its ultimate demise in 1973 (Wang 2008). The subsequent 
Office of Technology Assessment served in a similar role for the US Congress 
but fared only marginally better, lasting from 1972 to 1995. It attempted to
maintain neutrality by only explaining policy options without making explicit 
recommendations. However, its general critique of President Reagan’s Strategic 
Defense Initiative—mockingly called Star Wars—created conservative antipa- 
thy that eventually led to its demise. Suggestions have been made on how to 
make such science advisory bodies more resilient (Tyler & Akerlof 2019), but 
these anecdotes suggest that balanced counsel on science policy can be difficult 
to maintain. 

Compared to quantitative methods, assessment by expert opinion is time- 
consuming and expensive. The tradeoff is supposedly a better assessment. 
Unfortunately, the historical record is less than convincing. For example, the
National Science Foundation (NSF) uses peer review panels to assess research 
proposals based on the significance of goals, feasibility, the investigator's track 
record, and so on, but the process may not be capable of predicting even relative
rankings of future research impact. In a study of 41 NSF projects funded
a decade prior, panelists’ predictions of future success were found to have no 
significant correlation with the actual number of publications and citations 
coming from each funded project (Scheiner & Bouchie 2013). 
While expert panels are frequently used with the idea that group decisions
are better than individual reviews, scientists are not immune to social dynam- 
ics that hinder good decision-making. Non-academics or other outsiders can 
be sidelined, dominant personalities or senior scientists may expect deference, 
or the panel may engage in groupthink. Larger studies of the NIH peer review 
process have found that there is no appreciable difference between high- and 
low-ranked grant proposals in their eventual number of publications per grant, 
number of citations adjusted for grant size, or time to publication (Mervis 
2014). However, a study of 137,215 NIH grants awarded between 1980 and 
2008 found that the highest-rated grant proposals yielded the most publica- 
tions, citations, and patents such that a proposal with a review score one stand- 
ard deviation above another generated 8 percent more publications on average 
(Li & Agha 2015). Critics have questioned the cause of this correlation, given
that journal publications are also based on peer review; thus, any correlation
may only indicate measurement of the same reputational system.

The journal peer-review system was the subject of another study that 
followed the publication history of 1,008 submissions to 3 top medical jour- 
nals (Siler, Lee & Bero 2014). Of the 808 manuscripts that were eventually 
published, the lowest-rated submissions tended to receive the fewest eventual
citations. However, the top 14 papers were all rejected at least once, which
suggests the most innovative high-impact work is often unappreciated by the
peer-review process.*

Perhaps the most damning critique of expert opinion comes from the many 
examples throughout history of substantial scientific research that went unap- 
preciated by experts to an extent that is almost comical in hindsight. For 
example, biologist Lynn Margulis’ paper proposing that mitochondria and 
chloroplasts in eukaryotic cells evolved from bacteria (Sagan 1967) was origi- 
nally rejected by over a dozen journals. Over a decade later, DNA evidence 
confirmed the theory and Dr. Margulis was eventually elected to the National 
Academy of Sciences and given various awards, including the National Medal 
of Science. In another example, materials engineer Dan Shechtman needed 
two years to get his paper identifying the existence of quasicrystals published 
(Shechtman et al. 1984). He was met with ridicule from the scientific commu- 
nity and was even asked to leave a research group. This work eventually earned 
Dr. Shechtman the Nobel Prize in Chemistry in 2011. 

* While most rejections were desk rejections (that is, rejections by the journal editors),
this is, in practice, part of the peer-review process. This conservatism may be a sign
of ‘normal science’ in action (cf. Kuhn 1962).

The imprecision of peer review should come as no surprise to anyone who
has published in the academic literature for some time. It is not uncommon to
receive multiple reviewer comments that make contradictory assessments of a
manuscript’s quality or request mutually exclusive changes. Critiques of peer
review have been common since its inception (Csiszar 2016), and there have
been many attempts to improve the process: abandoning anonymous reviews
to improve accountability, publishing reviews to improve transparency, using
double-blind reviewing to remove bias for or against the author’s reputation
or publication history, or even awarding grants by random lottery to proposals
that meet established quality standards. Some of the calls for reform are in
fundamental conflict with each other (some want to fund projects, not pedigrees,
while others want to fund people rather than projects), and each side has
a plausible argument. While these changes may make the process fairer, it is
unclear whether they also improve the ability of experts to assess the long-term merit
of research. Ultimately, we are left with the likelihood that expert opinion is the
worst way to assess the benefits of research, except for all the other methods.
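
Of the reforms listed above, the lottery is the simplest to state as a procedure: screen proposals against a quality bar, then draw winners at random from those that pass. The Python sketch below is a minimal illustration of that idea, not a description of any funder's actual process. The function name, the score scale, the threshold, and the example proposals are all hypothetical.

```python
import random


def lottery_awards(proposals, quality_threshold, n_awards, seed=None):
    """Pick grant winners at random from proposals that clear a quality bar.

    `proposals` maps a proposal ID to a review score (higher is better).
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable for audits
    eligible = sorted(
        pid for pid, score in proposals.items() if score >= quality_threshold
    )
    if len(eligible) <= n_awards:
        return eligible  # enough funding for every qualifying proposal
    return sorted(rng.sample(eligible, n_awards))


# Invented example data: five proposals scored on a hypothetical 1-5 scale.
proposals = {"P1": 4.5, "P2": 2.0, "P3": 3.9, "P4": 4.8, "P5": 3.5}
winners = lottery_awards(proposals, quality_threshold=3.5, n_awards=2, seed=1)
```

In this sketch, expert review still sets the threshold, so the lottery changes only how winners are chosen among proposals that reviewers already consider fundable.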


Implications for Assessing the Benefits of Research 


A comparison of the most common ways policymakers assess the benefits of 
research provides some insight into science policy. Figure 2 shows the vari- 
ous approaches previously discussed ordered from the narrowest to the broad- 
est conception of benefits. This also corresponds to ordering from the most 
objective to most subjective. That is, assessing the job creation potential of a 
research program is comparatively objective and data-driven, while expert 
opinion requires considerable use of subjective estimates and value judgments. 
The choice of approach depends on the purpose of the assessment. One can use 
these various methods to obtain answers that are either objective and incomplete
or comprehensive and subjective, but not both objective and comprehensive.
For example, in the H5N1 virus case study in the previous chapter, the benefits
of research using potential pandemic pathogens are highly influenced by
social factors, suggesting that a broader conception of benefits is more appropriate
for assessment. Specifically, influenza research is most useful for regions
that have functional public health systems. Given the uneven distribution of 
basic public health services in the world, any research benefits are far more 
limited in extent than in an ideal world. The problem is further exacerbated by
the frequent regression of public health services in regions experiencing war
and failed governments. The sad reality is that most people in the world have no
access to an influenza vaccine of any kind. Since past influenza pandemics
were not recognized in their early stages, the likelihood that tailored vaccines
can be quickly distributed worldwide is small. These factors undermine the
immediate practical health benefits of this research. While this nuanced view
of science research benefits is useful, it increases the difficulty of quantification
and the uncertainty of the assessment.

Figure 2: Ways of assessing the benefits of research ordered by increasing
completeness of benefits that can be considered and also increasing uncertainty
and subjectivity of estimates.

Upon reflection, we can see that some types of research are more amenable to 
particular forms of assessment. This suggests scientists involved in ‘blue skies’ 
basic research that has only job creation as an immediate quantifiable benefit 
should avoid getting locked into an econometric valuation debate. When basic 
science is treated as a mere economic engine, the weaknesses rather than the 
strengths of curiosity-driven research are emphasized, resulting in weak justi- 
fications. Rather, basic science should be honestly argued on intellectual, aes- 
thetic, and even moral grounds if support from the general public is expected. 

For example, in 1970, Ernst Stuhlinger, a scientist and NASA administra- 
tor, responded to a letter from Sister Mary Jucunda. Given the plight of starv- 
ing children in Africa, she questioned the expenditure of billions of dollars for 
manned space flight (Usher 2013). Stuhlinger’s response is an eloquent defense 
of the value of research in general but a rather weak defense of space explo- 
ration based on several proposed practical benefits—none of which are actu- 
ally dependent on manned space flight: satellite data to improve agricultural 
output, encouraging science careers, increasing international cooperation, and 
serving as a more benign outlet for Cold War competition. However, Stuhlinger 
wisely closes the letter with a reference to an enclosed photograph of the Earth 
from the Moon and hints at its worldview-changing implications. The 1968 picture,
now referred to as ‘Earthrise’, was later described by nature photographer
Galen Rowell as ‘the most influential environmental photograph ever taken’
(Henry & Taylor 2009). Sometimes the greatest benefits cannot be quantified. 


Implications for Research Allocation 


There appears to be no method for assessing the benefits of research that is 
comprehensive, objective, and quantitative. This can make any research assessment
process rather contentious unless all the stakeholders are already in agreement.
Some science policy experts have suggested that the best science funding
strategy is simply stable investment over time (Press 2013). For example, the NSF
estimated that over $1 billion was spent over a 40-year timespan on the search for
gravitational waves. The result was a technical and intellectual achievement that
yielded a Nobel Prize and a new sub-field of astronomy. 

However, without the benefit of hindsight, it is hard to present a clear 
justification of what constitutes optimal research support. And without 
justification, proposed funding goals can appear arbitrary and claims of shortages
or impending crises may be met with skepticism (Teitelbaum 2014; National 
Science Board 2016). While this advice rightly acknowledges that research 
budgets should not be based on the perceived viability of individual projects, it 
fails to resolve the question of selection. Should policymakers treat and fund all 
research requests equally? 

Clearly, the general public does have science research priorities. A quick 
internet search of charities operating in the US yields dozens of charities that 
include cancer research as part of their mission but none for particle physics. 
The intellectual pleasures of discovering the Higgs boson in 2012 were real, but
medical science, with its more immediate application to human health, attracts 
considerably more public attention. This exact allocation issue was recognized 
50 years ago by philosopher Stephen Toulmin who wrote ‘the choice between 
particle physics and cancer research becomes a decision whether to allocate 
more funds (a) to the patronage of the intellect or (b) to improving the nation’s 
health. This is not a technical choice, but a political one’ (Toulmin 1964). The 
purpose here is not to argue over whether medical research is more worthwhile 
than particle physics. Rather, it is to highlight how different methods of valuing 
research have ethical and pragmatic dimensions that affect science policy. A
jobs-only valuation approach might prefer funding particle physics research for 
the many construction and engineering jobs it supports. Meanwhile, an econo- 
metric approach might prefer medical research based on historical growth 
rates in the pharmaceutical sector. Finally, a knowledge output approach might 
be ambivalent between the two options. 

Of course, even with explicit consideration, the expression of public values*
in science policy is not assured in the near term. For example, if a nation chose 
to scale back on ‘curiosity’ science, it is not clear that displaced scientists and 
engineers would necessarily start working on applied projects that would more 
directly minimize human suffering. Scientists and engineers are not fungible 
commodities, nor are they devoid of personal preferences regarding how
they spend their time—research is not simply a zero-sum game. Likewise, pub- 
lic research funding is generally small compared to many other government 
expenditures, which may have considerably less societal benefit. In this cen- 
tury, the US federal budget for science research has been approximately one 
tenth of military spending. One can only imagine the benefits to humanity if 
those numbers were reversed. 

* Public values can be defined as the ethical consensus of society on what constitutes
the rights, freedoms, and duties of individuals, organizations, and society. This definition
also acknowledges that public values are not necessarily fixed, monolithic, or
entirely compatible with each other (for example, valuing both liberty and security)
(Bozeman 2007).

William Press, president of the American Association for the Advancement
of Science, stated ‘[a] skeptical and stressed Congress is entitled to wonder
whether scientists are the geese that lay golden eggs or just another group of
pigs at the trough’ (Press 2013). Questioning the social value of science was
prevalent in the early 20th century (Bernal 1939), but this skeptical attitude
about the US science community fell out of favor for several decades after Vannevar
Bush rather successfully argued that science should be insulated from the
political process (Bush 1945).* Nonetheless, research assessment and funding
decisions have always been predicated on an expectation of societal benefit.
The predominant belief has been that research funding directly translates into
knowledge and innovation. The problem is determining exactly what those
benefits are and what they are worth.

* See Guston (2000) for a more detailed discussion of the history of changing expectations
of science.

In summary, there is no universally acceptable method for assessing the
benefits of research. This does not mean that assessing the benefits of science
research is impossible or uninformative, only that formal quantitative benefits
assessments should be used with extreme caution. Quantitative results that
appear objective may be hiding a great deal of subjectivity. Failure to consider
the limitations of each method risks letting the chosen method shape the goal
of the assessment, which is the reverse of what constitutes good policymaking.


Values in Risk Assessment 


Having discussed how the benefits of research are assessed, we turn our attention 
to assessing risks. The field of risk analysis is often categorized into three 
main branches: risk assessment, risk management, and risk communication 
(Paté-Cornell & Cox 2014). Risk management, the process of deciding how 
to address risks, is widely understood to include subjective value judgments 
(Aven & Renn 2010; Aven & Zio 2014). Meanwhile, there is an idealized notion 
that good risk assessments are relatively free of values (Hansson & Aven 2014). 
However, despite our best efforts at quantitative rigor, the outcomes of risk 
assessments reflect the many value judgments implicit in the assumptions of 
the analysis (Ruckelshaus 1984; MacLean 1986; Shrader-Frechette 1986, 1991; 
Cranor 1997; Hansson 2013). Unfortunately, this common misconception 
about the nature of risk assessment means that pervasive but overlooked value 
judgments can transform seemingly objective assessments into stealth policy 
advocacy (Pielke Jr 2007; Calow 2014). 

Policy analysis that is useful to stakeholders requires the clear identification 
of all significant assumptions and judgments (Morgan, Henrion & Small 1990; 
Fischhoff 2015; Donnelly et al. 2018). Delineating assumptions can give an 
analyst insight into how to minimize unnecessary assumptions and account 
for the remaining assumptions in a more transparent manner. However, this is 
tricky because there is no systematic list of value assumptions in risk analysis 
to consult. Even if there were such a list, it would be controversial; value judgments
are difficult to recognize.

Rather than attempt the Sisyphean task of exhaustively detailing every 
possible value assumption, the intent here is only to discuss some of the most 
common and contentious value judgments to illustrate the inherent subjectiv- 
ity of risk assessment. This is useful because subject experts attempting a risk 
assessment for potentially dangerous science may be in unfamiliar territory. 
Many scientists are trained to view subjectivity as a sign of incompetence or
unprofessional behavior. This is an unnecessarily narrow view of science and 
a particularly unhelpful attitude in risk analysis. A general roadmap of value 
assumptions may make a more convincing argument that all risk assessments 
involve unavoidable and important value assumptions that, if ignored, decrease 
the credibility of a risk assessment and its utility in formulating public policy. 

It should be noted that the following discussions of each topic are only brief 
introductions with relevant references for further exploration. Some topics, 
such as the treatment of uncertainty, can be rather technical. The point here is 
only to show that conflicting schools of thought exist for many of the consid- 
erations within a risk assessment. 


Categorizing Assumptions 


The term values is used here to mean a broad class of knowledge that exists 
between facts and accepted theory at one end of the spectrum and mere opin- 
ion at the other. Here, values are conclusions, based on the same set of available 
data, over which reasonable people might disagree. This definition describes 
values as having a basis in facts that are used to form reasons for preferring 
one thing over another (MacLean 2009). For example, preferring oysters rather
than chicken for dinner because they taste better is an opinion. Preferring oysters
rather than chicken because they are healthier or more humane reflects values. Those
preferences are values because they are based on some underlying facts— 
specifically, oysters have higher iron content than chicken and oysters have 
much simpler nervous systems—but the implications of those facts are open to 
interpretation and subject to varying degrees of public consensus. 

One might claim any position that has a basis in fact can, in theory, be 
determined to be true or false by constructing a logical argument by Socratic 
dialogue or similar means. However, in practice individuals have taken posi- 
tions (core beliefs, ideologies, schools of thought, etc.) from which they will 
not be easily dissuaded without overwhelming evidence that may not currently 
exist. This is not to say these value disputes will never be settled, just not yet. 
And here we run into the greatest challenge in risk assessment—the need to 
say something substantial about a decision that must be made in the present 
despite a state of approximate knowledge and uncertainty. The result is risk 
assessments rife with value judgments. 

Value judgments in risk assessment can be categorized in multiple ways. One 
popular distinction is between two general classes of values: epistemic and 
non-epistemic (Rudner 1953; Rooney 1992; Hansson & Aven 2014). Epistemic 
value assumptions pertain to what can be known and how best to know it. Epis- 
temological arguments, traditionally discussed in the philosophy of science, are 
nonetheless values in that they embody judgments of relative merit (Hempel 
1960). Non-epistemic value assumptions (i.e., ethical or aesthetic values) 
typically deal with what ought to be or what is desirable or acceptable, keeping
in mind that what is socially acceptable and what is ethical are not always 
the same thing (Taebi 2017). It is important to note the classification of value 
judgments is not static. Some epistemic value judgments, such as the choice of 
appropriate statistical techniques, may move along the spectrum of knowledge 
toward accepted theory as evidence accrues. Likewise, some ethical value 
judgments eventually become universal and are no longer a source of dispute. 
History is full of behaviors that were acceptable a century ago but are universally 
condemned today—as well as the reverse. 

While the epistemic/ethical distinction has philosophical importance, it is 
less critical for risk analysts because, as discussed later, some assumptions can 
be justified by both epistemic and ethical reasons. Rather, organizing value 
assumptions by where they arise in the analysis process is more useful for 
instructive purposes. Momentarily ignoring that the process is iterative, the 
following is a discussion of the value assumptions that arise in each step of a 
typical risk assessment. 


Selection of Topic 


Commonly, an analyst will be employed to conduct an assessment of a specific 
risk. However, when not already predetermined, the first value judgment made 
in any risk assessment is the choice of topic. When tasked with evaluating risk 
within a large pool of potential hazards, a screening mechanism or set of cri- 
teria is needed that involves some epistemic and/or ethical value judgments to 
prioritize efforts. People employ heuristics (i.e., mental shortcuts) when assess- 
ing commonplace situations that can lead to biases—risk assessment is no dif- 
ferent (Tversky & Kahneman 1974; Slovic, Fischhoff & Lichtenstein 1980). 

Because risk assessments are generally performed for situations that are 
perceived to be dangerous rather than benign, an analyst’s perception of risk 
gives rise to important judgments on topic choice. For example, technological 
hazards are perceived to be more controllable than natural hazards, but also 
more dangerous and more likely (Baum, Fleming & Davidson 1983; Brun 
1992; Xie et al. 2011). If the public perceives anthropogenic risks to be more 
threatening than natural risks, the result should be a tendency to conduct 
more technological risk assessments while overlooking equally or more risky 
natural hazards.¹

¹ This phenomenon may be partially responsible for inadequate natural disaster
preparedness. A related question is whether many nations allocate too many
resources to military spending rather than natural disaster mitigation.

Likewise, a risk assessment is often influenced by how easily we can imagine
a risk—a bias known as the availability heuristic (Tversky & Kahneman 1973).
Funding for a risk assessment often appears when a threat is fresh in the minds
of a funding organization, rather than when the risk is the greatest (assuming
actual risk is even knowable). Thus, assessments tend to follow current
events. For example, newsworthy meteorite impacts and near-misses tend 
to periodically renew interest in assessing the risk of major impacts from 
near-earth objects. The value judgments involved are both epistemic and
ethical: epistemic in the sense that the hazard is assumed to be more real
and knowable now, and ethical in the sense that the hazard is now viewed as
more worthy of analysis than other hazards.

Before we move on to other concerns, it is important to note that the term 
bias shall be used here as traditionally used among decision science experts 
who have made careers enumerating and explaining the various ways in 
which humans frequently make terribly illogical decisions when confronted 
with unfamiliar low-probability, high-consequence risks.² That said, decision
science tends to focus on the negative anti-rational behaviors. However, some 
decision-making behaviors that fall outside of traditional rationalism can also 
improve decisions—moral duty, desire for autonomy, acts of altruism, and so 
forth. Bias is not always bad. 


Defining System Boundaries 


For any assessment, the boundaries of the analysis must be selected. This 
includes time frame, spatial scale, and relevant populations. While the bounda- 
ries are primarily determined by the topic, some epistemic value judgments are 
involved. First, the analyst must believe meaningful boundaries can be defined 
for a system. The idea of inherent interconnectedness has long existed in the 
philosophical traditions of Buddhism and Taoism but was an uncommon idea 
in Western science until the 20th century when the emerging field of ecology 
led to sentiments such as, ‘When we try to pick out anything by itself, we find
it hitched to everything else in the Universe’ (Muir 1911) and ‘Everything is
connected to everything else’ (Commoner 1971). Nonetheless, the reductionist 
approach to science has enjoyed a great deal of popularity and success. 

Risk assessments are often performed for natural and social systems for 
which an underlying hierarchical structure is not yet understood. Thus, 
the analyst is forced to make bounding decisions based on scientific (i.e., 
epistemic) judgments. Ideally, expanding an assessment to add secondary 
or tertiary factors will make incrementally smaller changes to the results 
thereby showing the assessment to be convergent, but this is not always the 
case. Practical limitations of available data and funds tend to dictate that a risk 
assessment be narrowly defined, yet synergistic effects, non-standard uses, or
other sociotechnical surprises can play a significant role in the overall risk of a
new technology (Jasanoff 2016). To complete the assessment, the analyst must
believe it is possible to know what factors can be left out without appreciably
affecting the results.

² For a particularly readable summary of decision biases, see The Ostrich Paradox
(Meyer & Kunreuther 2017).

Ethical and epistemic considerations also pervade many choices regarding 
the relevant population in a risk assessment. Should an assessment be restricted 
to humans or should it include all sentient animals or even an entire ecosystem? 
Should the study involve only current populations or include future generations? 
These value assumptions can be influenced by many factors, such as type of 
academic training. For example, one study found that physicists and geologists 
were more likely to believe that performing a risk assessment of a long-term 
nuclear waste repository was a reasonable task, while anthropologists and phi- 
losophers were considerably more skeptical (Moser et al. 2012). 


Method of Assessment 


A variety of decisions about how risk assessments are conducted can have 
profound impacts on the results. In addition to the usual debates about accept- 
able scientific methodology, there is also considerable variation in the risk 
conceptual frameworks that have been adopted by different organizations, 
countries, and academic communities (Clahsen et al. 2019). The following 
highlights some of the more controversial choices that must be made. 


Unit of assessment 


An analyst must decide whether the risks will be expressed in monetary units 
or another unit relevant to the risk, such as injuries/year. Non-monetary units 
of risk are commonly used in pharmaceutical, healthcare, and epidemiological 
risk assessments. However, it becomes more difficult to compare these assess- 
ments with non-medical priorities and to make policy recommendations that 
consider financial resources. Units are also important in that a different unit 
can change the perception of the risk (Wilson & Crouch 2001). Depending 
on the situation, a relative risk can appear to be much larger than an absolute 
risk or vice versa. For example, a WHO report regarding the 2011 Fukushima 
nuclear accident estimated that the lifetime risk of thyroid cancer for nearby 
infant females increased from about 0.75 to 1.25 percent. Depending on the 
desired effect, media reported the findings as a 70 percent relative rate increase
or a 0.5 percentage point absolute rate increase. Subsequent mass-screening of
Fukushima children for thyroid abnormalities resulted in many unnecessary
medical procedures and considerable public anxiety (Normile 2016). Conversely,
in the H5N1 avian influenza research debate discussed in the first chapter, 
one critique of the conducted risk-benefit assessment was that the consultant 
used relative probabilities rather than absolute probabilities, which appeared to
understate or at least obscure the risk. 
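
The arithmetic behind the two framings can be sketched in a few lines. This is only an illustration using the rounded figures quoted above; the function name `risk_change` is hypothetical. With these rounded inputs the relative increase works out to roughly two thirds, so the reported 70 percent presumably reflects unrounded estimates (an assumption, not something stated in the source).

```python
def risk_change(baseline, exposed):
    """Return (absolute_increase, relative_increase) for two probabilities."""
    absolute = exposed - baseline      # difference in probability
    relative = absolute / baseline     # difference as a fraction of baseline
    return absolute, relative


# Rounded lifetime thyroid cancer risks quoted in the text.
baseline = 0.0075   # 0.75 percent
exposed = 0.0125    # 1.25 percent

absolute, relative = risk_change(baseline, exposed)
print(f"absolute increase: {absolute * 100:.2f} percentage points")
print(f"relative increase: {relative * 100:.0f} percent")
```

The same change in risk yields a small number under one framing and a large number under the other, which is why the choice of presentation is itself a value judgment.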

These choices are intertwined with risk communication and can have seri- 
ous implications. A risk can be presented (intentionally or not) in such a way 
as to minimize or inflate the severity of a particular hazard. The influence of 
context—also known as framing effects (Tversky & Kahneman 1981)—is sub- 
stantial. A basic aspiration of risk assessment is to provide unbiased informa- 
tion, so it is generally considered unprofessional to use framing to make a risk 
assessment appear more favorable to an analyst's preferences. However, there is 
widespread support for framing risk outcomes that encourage environmental 
responsibility, prosocial behavior, or positive health practices (Edwards et al. 
2001; Gallagher & Updegraff 2012). So it seems that framing is bad, unless it 
is used with good intentions. How sure are we that someday those good inten- 
tions will not be seen as socially or scientifically misguided? Clearly, value judg- 
ments in risk assessments are not only pervasive, but also complicated. 

Selection of a unit also involves an epistemic value judgment regarding the 
measurability of a characteristic. Presumably, an analyst would not pick an 
unmeasurable unit. The choice of units is also accompanied by important, and 
often unwitting, ethical assumptions. For example, the units of lives saved, life 
years gained (LYs), quality adjusted life years (QALYs), and disability adjusted 
life years (DALYs) all preferentially benefit different populations (Robberstad 
2005). Likewise, the disparate nature of various risks and benefits often requires 
the use of assumption-laden conversion factors and equivalencies to make com- 
parisons. Ideally, an analyst should present an assessment using multiple units 
to give perspective and aid comparisons, but this requires more time and effort. 


Value of life 


The primary benefit of using a monetary unit is the ability to integrate the 
results into a larger economic analysis. However, the conversion to monetary 
units requires some important value assumptions. The most controversial is the 
need to monetize the value of life. Attempts to quantify the value of life often 
use willingness-to-pay measurements or expert opinion. A defense of these 
estimates is that the number does not represent the worth of an actual life but 
rather the rational amount society should be willing to spend to decrease the 
probability of a death when individual risk is already low. However, measure- 
ment techniques can confuse willingness-to-pay with ability-to-pay—a much 
less ethical measure because it undervalues the lives of the poor. Likewise, indi- 
vidual behavior is inconsistent; the amount an individual will pay to avoid a 
risk often differs from the amount the same individual must be paid to take a 
risk (Howard 1980). Furthermore, an individual’s willingness-to-pay to save 
a life appears to vary depending upon whether the choice is made directly or 
indirectly through a market (Falk & Szech 2013). 

Willingness-to-pay methods have two other fundamental difficulties
(MacLean 2009). First, public well-being and willingness-to-pay are not 
always equivalent. Some individuals have preferences that are counter to the 
well-being of society or even their own well-being. Second, economic valua- 
tion is an incorrect way to measure many abstract and essential values, such 
as duties associated with religion or community. Individual behavior is often 
inconsistent when people are asked to put a price on deeply held values, and 
such requests are frequently met with moral outrage (Tetlock 2003). 


Discount rate 


Another contentious issue regarding the use of monetary units is the choice of 
discounting rate or how much future money is worth now. One can choose a 
high discount rate, such as seven percent—the average return on private invest- 
ment; a low discount rate, such as one percent—the typical per capita con- 
sumption growth; or even a discount rate that declines over time to account 
for the uncertainty of future economic conditions (Arrow et al. 2013). The eco- 
nomic implications of selecting a discount rate are complicated enough, but the 
discount rate is also a proxy for complex intergenerational fairness issues—how 
we account for a future generation’s values, technology, and wealth. Selecting 
higher discount rates shrinks the present value of future costs and tends to place more burden
on future generations. The time range of the assessment determines the impor- 
tance of the discount rate. For example, the choice of discount rate is often the 
most fundamental source of disagreement in long-term climate change eco- 
nomic risk assessments (Stern 2006; Nordhaus 2007; Stern & Taylor 2007). 
The public commonly minimizes the value of future lives (Cropper, Aydede 
& Portney 1994). Even the basic idea of ethical consideration for future
generations is not universally accepted (Visser ’t Hooft 1999). However, international
laws increasingly acknowledge that physical distance is no longer an excuse 
for exclusion from risk considerations; eventually, separation in time may no 
longer be an acceptable reason for ‘empathic remoteness’ (Davidson 2009). 


Other methodological considerations 


One of the first methodological decisions is whether to perform a qualitative 
or quantitative risk assessment. Selecting a qualitative assessment may indi- 
cate deep uncertainty, lack of confidence in available data, or even mistrust of 
available quantitative methods. For example, the FDA issued new guidelines 
in 2013 that rejected purely quantitative risk-benefit assessments for new drug 
approvals because they often leave out important factors that are difficult to 
quantify. Likewise, a quantitative assessment may be selected for good
reasons, such as strong past performance, or bad ones, such as to use numbers to
imbue an assessment with an air of authority. The academic prestige associated
with mathematical analysis contributes to ‘a frequent confusion of mathemati- 
cal rigor with scientific rigor’ (Hall 1988). There is also a general bias against 
qualitative work in many scientific fields that might steer an analyst toward a 
quantitative assessment to avoid being accused of speculation. 

The form of assessment will also depend on the exact definition and 
treatment of risk used. There are multiple common meanings of the term risk: 
an undesirable event, the cause of an undesirable event, the probability of an 
undesirable event, the expectation value of an undesirable event, or a decision
made under conditions of known probabilities (Möller 2012). These definitions of risk
range from vague to precise, and their frequency of usage varies by academic 
discipline (Althaus 2005). The public tends to use risk qualitatively and com- 
paratively; whereas, decision theorists and economists are more likely to use 
the more quantitative definitions (Boholm, Möller & Hansson 2016). When
the public uses a more expansive conception of risk, professionals may often 
dismiss public sentiments as a biased overestimation of a small risk (Aven 
2015). However, even experts tend to use the term inconsistently, which adds 
to the confusion. 

In the relatively young field of risk analysis, the expectation value interpreta- 
tion of risk has become widespread along with the use of probabilistic risk anal- 
ysis. In practice, this simply means risks are compared by combining events 
with their probability of happening. For example, in a hypothetical game of 
chance where you win or lose money, there could be a 50 percent chance of 
losing $1 and a 50 percent chance of winning $2. The expected utility of the 
game would be 


0.5 × (−$1) + 0.5 × $2 = $0.50


If you played the game repeatedly, you would, on average, expect to win 50 
cents per game. Expected utility, part of rational choice theory, can be a handy 
method of comparing options in many economic decisions. Unfortunately, 
expected utility can be an unhelpful way to express events that are infrequent 
and serious—a common occurrence in risk assessments (Hansson 2012). For 
example, using the previous game, what if instead there was a 50 percent chance
of losing $10,000 and a 50 percent chance of winning $11,000? In this case, the
expected utility would be $500. That is a much better expected utility than the 
original game, but fewer people would be willing to play the second game. The 
reason is explained by prospect theory (Kahneman & Tversky 1979), which 
argues that people have an aversion to large losses that causes deviations from 
rational choice theory. The larger the loss, the less likely people will attempt to 
maximize expected utility. 
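
The two games can be compared with a short sketch. It computes the probability-weighted average (called `expected_value` here) along with the standard deviation of the payoff, which is added only as one rough indicator of how much is at stake in a single play; it is not a model of prospect theory.

```python
from math import sqrt


def expected_value(outcomes):
    """Probability-weighted average of (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)


def std_dev(outcomes):
    """Standard deviation of the payoff of a single play."""
    mean = expected_value(outcomes)
    return sqrt(sum(p * (payoff - mean) ** 2 for p, payoff in outcomes))


small_game = [(0.5, -1), (0.5, 2)]
large_game = [(0.5, -10_000), (0.5, 11_000)]

for name, game in (("small", small_game), ("large", large_game)):
    worst = min(payoff for _, payoff in game)
    print(f"{name}: expected {expected_value(game):.2f}, "
          f"spread {std_dev(game):.2f}, worst case {worst}")
```

The expected value rises a thousandfold from the first game to the second, but so does the worst case, and the spread grows even more. A single summary number hides exactly the feature that makes the second game unattractive to most players.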

Additionally, risk assessments frequently include important ethical consid- 
erations missed by expected utility. For example, a 10 percent chance of one 
death has the same expectation value as a 0.01 percent chance of 1,000 deaths. 
However, few people would say these two possibilities are morally equivalent.
Despite these obvious issues with expected utility, it is still commonly used in 
risk assessments. 

A related concern is with the criterion of economic efficiency—an assump- 
tion used when attempting to maximize expectation values. Maximizing the 
utility for the most likely outcome seems like a good idea, but it may come at 
the cost of potentially missing less efficient, but more robust options (Ben-Haim 
2012). There is an ethical decision here. Is it better to favor efficiency to avoid 
waste or is it better to favor robustness—options that work over a wide range 
of situations—in order to minimize worst-case scenarios? These competing 
values are frequently encountered elsewhere in life. For example, should one 
invest in stocks with the highest rate of return or in less lucrative, but more sta- 
ble investments? The public wants airplanes and bridges to be built economi- 
cally to minimize costs and avoid wasting resources but with a large enough 
safety factor to protect the lives of those using them. 
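
The tension between efficiency and robustness can be made concrete with a toy decision table. Everything in this sketch is hypothetical: the option names, the scenarios, and the payoffs are invented for illustration, and the worst-case rule shown is only one simple way to express a preference for robustness.

```python
# Hypothetical payoffs for two designs under three scenarios.
options = {
    "efficient": {"most likely": 10, "mild stress": 4, "severe stress": -20},
    "robust": {"most likely": 7, "mild stress": 6, "severe stress": 3},
}


def best_for_scenario(options, scenario):
    """Pick the option with the highest payoff in one chosen scenario."""
    return max(options, key=lambda name: options[name][scenario])


def best_worst_case(options):
    """Pick the option whose worst payoff across scenarios is highest."""
    return max(options, key=lambda name: min(options[name].values()))


print("optimizing the most likely outcome:",
      best_for_scenario(options, "most likely"))
print("optimizing the worst case:", best_worst_case(options))
```

The two rules select different options from the same table. Neither choice is a calculation error; the disagreement comes entirely from which criterion the analyst decides to apply.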

In the case of extreme uncertainty or ignorance, it is pointless to attempt to 
maximize expectation value. Rather, analysts should encourage ‘robust satisfic- 
ing’ (Smithson & Ben-Haim 2015), qualitatively optimizing against surprise. 
This entails retaining as many acceptable options as possible (while also avoid- 
ing indecision) and favoring options that are the most reversible and flexible. 


Treatment of Error 


In any scientific statement or statistical test, there are two types of errors that 
can be made. A Type I error is finding an effect or phenomenon where it does 
not exist (incorrectly rejecting a true null hypothesis). A Type II error is failing to
find an effect that does exist (failing to reject a false null hypothesis). Traditionally,
scientists have focused on Type I errors because the emphasis is on avoiding 
the addition of false theories to the corpus of science (Hansson 2012). However, 
an emphasis on rejecting false positives will likely miss real hazards for which 
there is not yet conclusive evidence. This is the reason why some products are 
banned many years after they are first introduced—it may require considerable 
data to build a case that will survive Type I error rejection. In risk assessment 
and public policy, concentrating on Type II errors, false negatives, may be pre- 
ferred (Douglas 2000). That is, the epistemological values of traditional scien- 
tific inquiry may not be appropriate for risk assessments. This suggests the need 
to select different criteria of what constitutes acceptable evidence. However, it 
is not always as simple as reanalyzing data with different criteria for rejection. 
One approach is to use the precautionary principle, which can be viewed as a 
qualitative attempt at minimizing Type II errors within a body of science that 
was generated using Type I error minimization criteria. However, detractors of 
the precautionary principle believe an emphasis on Type II errors strays too far 
from defensible science (Sunstein 2005). 
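
The tradeoff between the two error types can be illustrated with a one-sided z-test, a deliberately simple stand-in for whatever test a real assessment would use. The effect size of two standard errors is a hypothetical input chosen for illustration; the sketch relies on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def type_ii_rate(alpha, effect_size):
    """Type II error rate of a one-sided z-test.

    alpha: accepted Type I error rate (false positive rate).
    effect_size: true effect in standard-error units (hypothetical input).
    """
    critical_value = STANDARD_NORMAL.inv_cdf(1 - alpha)
    # Probability that the test statistic falls below the cutoff
    # even though a real effect is present.
    return STANDARD_NORMAL.cdf(critical_value - effect_size)


for alpha in (0.01, 0.05, 0.20):
    beta = type_ii_rate(alpha, effect_size=2.0)
    print(f"Type I rate {alpha:.2f} -> Type II rate {beta:.2f}")
```

For a fixed amount of evidence, tightening the Type I criterion raises the chance of missing a real hazard, and loosening it does the reverse. Choosing the cutoff is therefore a statement about which mistake is considered worse.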


Model Selection


Risk assessments require selecting a risk model, and the selection process always 
involves value judgments. For example, simply fitting data to a standard dose- 
response curve can be accomplished by a variety of similar statistical techniques: 
maximum likelihood, non-linear least squares, piecewise linear interpolation, 
and so forth. Selecting the method is an epistemic value judgment, and its effect 
on the outcome of the analysis may or may not be trivial. For example, both 
log-normal and power-law distributions are highly right-skewed (heavy-tailed) 
probability distributions that yield more extreme large events than normally- 
distributed phenomena, but distinguishing between the two when fitting data 
can be difficult (Clauset, Shalizi & Newman 2009). However, the distinction can 
be important because log-normal distributions have a well-defined mean and 
standard deviation; whereas, power-law distributions sometimes do not and 
their approximated average is dependent on the largest estimated or observed 
event (Newman 2005). This could substantially affect the results of a risk assess- 
ment that uses mean estimates of hazard magnitude (Hergarten 2004). This 
behavior is particularly relevant to pandemic risk assessments because the size 
of various epidemics may follow power-law distributions without finite means, 
including cholera (Roy et al. 2014), measles (Rhodes & Anderson 1996), and 
early-stage influenza (de Picoli Junior et al. 2011; Meyer & Held 2014). 
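
The dependence of a power-law average on the largest event can be shown analytically. The sketch below assumes a Pareto form with minimum size 1 and survival function x^(-a), truncated at a largest possible event `x_max`; the specific tail indices and cutoffs are hypothetical and are not fitted to any of the epidemics cited above.

```python
from math import log


def truncated_pareto_mean(tail_index, x_max):
    """Mean of a Pareto variable (minimum size 1) truncated at x_max.

    tail_index: exponent a of the survival function x ** -a.
    x_max: largest event size considered possible (hypothetical cutoff).
    """
    if tail_index == 1:
        numerator = log(x_max)
    else:
        numerator = (tail_index / (1 - tail_index)
                     * (x_max ** (1 - tail_index) - 1))
    # Renormalize because probability mass beyond x_max is excluded.
    return numerator / (1 - x_max ** -tail_index)


for tail_index in (1.0, 3.0):
    for x_max in (1e3, 1e6):
        mean = truncated_pareto_mean(tail_index, x_max)
        print(f"tail index {tail_index}, largest event {x_max:.0e}: "
              f"mean {mean:.2f}")
```

With a tail index of 3 the mean barely moves when the cutoff is raised a thousandfold, whereas with a tail index of 1 it doubles. In the second case any reported average is largely a statement about the assumed largest event.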

Other decisions are a mix of epistemic and ethical. For example, when 
selecting a low-dose exposure risk model, an analyst might choose a linear or 
non-linear model (Calabrese & Baldwin 2003) based on an epistemic value 
judgment. However, the selection may also be an ethical value judgment 
reflecting the analyst’s belief regarding whether a model should strive to be as 
scientifically accurate as possible or whether it should err on the side of being 
conservatively protective (Nichols & Zeckhauser 1988; MacGillivray 2014). 

As another example, epidemiologists may choose to use a continuous or 
discrete mathematical model to represent the behavior of an epidemic. A discrete 
model is useful in that it is easier to compare to epidemiological data (which is 
usually collected at discrete times) and is easier for non-mathematicians to use 
(Brauer, Feng & Castillo-Chavez 2010). However, discrete models sometimes 
exhibit very different dynamics than the continuous models they are intended to 
duplicate when used outside of a narrow set of parameters (Mollison & Din 1993; 
Glass, Xia & Grenfell 2003). Thus, researchers must make judgments regarding 
the relative value of correctness versus tractability when selecting models. Some 
of the same concerns arise when selecting between an analytical model and an 
approximated numerical model. 
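
A toy example shows how a discrete model can depart from its continuous counterpart. The sketch uses simple logistic growth rather than any of the epidemic models cited above, and the growth rates are hypothetical; the discrete version is the unit-time-step update of the same equation.

```python
from math import exp


def continuous_logistic(x0, rate, times):
    """Exact solution of dx/dt = rate * x * (1 - x) at the given times."""
    return [1 / (1 + (1 / x0 - 1) * exp(-rate * t)) for t in times]


def discrete_logistic(x0, rate, steps):
    """Unit-time-step update x <- x + rate * x * (1 - x)."""
    values = [x0]
    for _ in range(steps):
        x = values[-1]
        values.append(x + rate * x * (1 - x))
    return values


steps = 12
for rate in (0.5, 2.5):
    smooth = continuous_logistic(0.01, rate, range(steps + 1))
    stepped = discrete_logistic(0.01, rate, steps)
    print(f"rate {rate}")
    print("  continuous:", [round(x, 3) for x in smooth])
    print("  discrete:  ", [round(x, 3) for x in stepped])
```

At the low rate the two versions rise smoothly toward the same limit. At the high rate the continuous solution still rises smoothly, but the discrete version overshoots and oscillates, which is behavior the underlying equation does not have.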


Theoretical versus empirical choices 


The basis of a risk assessment model may be primarily theoretical or 
empirical, although this distinction is somewhat artificial. Because risk 
estimates involve projections into the future, they all have a theoretical
component—even empirical trend extrapolations rely on the theory that future 
behavior will mimic past behavior. Nonetheless, there are special considera- 
tions for models that are more empirical or theoretical in nature. For example, 
defining and characterizing uncertainty in theoretical models requires more 
assumptions because most uncertainty characterizations rely on statistical 
techniques which require data. Likewise, empirical models must assume the 
data used to create the model are representative of the full range of possibilities 
(Lambert et al. 1994). In The Black Swan, Nassim Taleb illustrates this
data incompleteness problem with a parable about a turkey that believes the 
farmer who feeds him daily is his best friend—until Thanksgiving eve when 
the farmer kills the turkey without warning. The last data point turned out to 
be the important one. 

Opinions on empiricism constitute an important value assumption that 
rests on the perceived relationship between data and theory. A simplistic 
conception of science is that observations are used as a basis to form theories 
and subsequent observations then support or refute those theories. However, 
facts and observations are theory-laden (Feyerabend 1975; Mulkay 1979; 
Rocca & Andersen 2017). While theories are often inspired by observations, 
these observations are unconnected until interpreted within a theoretical 
framework. Both data and theory are intertwined. Detailed observations 
of celestial motion by ancient astronomers were hindered from providing 
more insight by a persistent theory of geocentrism. Misinterpreting a lot of 
data with the wrong model only improves the precision of the error. In the 
Thanksgiving turkey parable, all the data was leading toward the wrong con- 
clusion because the turkey misunderstood the essential relationship between 
turkeys and farmers. 

There is a growing emphasis on empiricism in the era of Big Data, but data 
mining is helpful only if the appropriate data are analyzed with the correct 
theoretical interpretation. Likewise, claims that we now live in an era of data- 
driven science are only partly correct. It appears that theory is often trying 
to catch up with the mountains of data science produces, but no one collects 
and analyzes data without at least a simple implicit underlying theory. The 
2008 global financial crisis occurred despite (and perhaps because of) count- 
less financial risk models with copious data that failed because they were miss- 
ing key theoretical dependencies. The use of big data has also been turned to 
the ‘science of science’ to explore the predictability of scientific discovery only 
to find fundamental unpredictability in the absence of an underlying theory 
(Clauset, Larremore & Sinatra 2017). 

Likewise, it is a value judgment to dismiss non-empirical methods for
generating scientific knowledge or assessing risks. The Gedankenexperiment 
(thought experiment) has been used with great success in many fields ranging 
from philosophy to physics. Sadly, not all fields of science yet appreciate the 
importance of theory-building as a complement to, rather than a component 
of, fact-gathering (Drubin & Oster 2010). 


Level of complexity


Selecting the level of model complexity in a risk assessment entails some value 
judgments. Analytical complexity is often equated with thoroughness and 
appropriate representation of reality. Meanwhile, proponents of simplicity will 
invoke Occam's razor—the principle that simpler explanations are preferred. 
A common theme in modeling is that the ideal model is as simple as possible 
while still aiding in informed decision-making (Vezér et al. 2018). But what is 
the appropriate level of detail? 

Analysts face the competing goals of representativeness and usefulness. A 
broad and comprehensive assessment may offer a nuanced description of risk, 
but such completeness may not lend itself to the clear comparisons needed for a 
policy decision. Level of complexity is usually a tradeoff. While a simple assess- 
ment may be easier to explore and explain, it runs a higher risk of missing 
critical relationships. Meanwhile, a complex assessment has a better chance of 
capturing all the salient components of a system, but it is also harder to evalu- 
ate, understand, and compare to competing assessments (von Winterfeldt & 
Edwards 2007). There are methods, such as hierarchical holographic modeling 
(Haimes 1981; Haimes, Kaplan & Lambert 2002) and fault tree analysis (Vesely 
et al. 1981), which can help enumerate all of the potential interactions within a 
complex system, but no systematic guide for inductively (bottom-up) or deduc- 
tively (top-down) investigating risk can guarantee completeness. 

One might presume complex systems generally require more complex analy- 
sis. However, simple modeling may be desirable when there is a lack of theory 
or when complex models are known to lack predictive ability. For example, the 
time between large earthquakes appears to follow an exponential distribution. 
Because this process is memoryless (the time between events is independent 
of previous events), simple average time between events is just as useful as a 
more complex model (Cox 2012a). A similar issue can arise when successful 
methods are applied to new situations. Seeing the success of complex quanti- 
tative modeling in the engineering sciences, analysts sometimes overreach by 
applying the same techniques to poorly-understood complex systems (Pilkey & 
Pilkey-Jarvis 2007). An approach that works for buildings and bridges may not 
work when modeling ecological systems. 
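
The memoryless property mentioned for the exponential distribution can be checked directly. In this sketch the mean interval of 100 years and the waiting times are hypothetical values, not estimates for any real fault.

```python
from math import exp


def survival(t, mean_interval):
    """Probability that the wait for the next event exceeds t."""
    return exp(-t / mean_interval)


def conditional_survival(waited, additional, mean_interval):
    """Probability of waiting `additional` more, given `waited` so far."""
    return survival(waited + additional, mean_interval) / survival(
        waited, mean_interval
    )


mean_interval = 100.0  # hypothetical average years between events

fresh = survival(30, mean_interval)
overdue = conditional_survival(150, 30, mean_interval)
print(f"no event in the next 30 years, starting now: {fresh:.4f}")
print(f"same, after 150 quiet years:                 {overdue:.4f}")
```

Both probabilities are identical, so under this assumption the elapsed time since the last event carries no information, and the average interval already summarizes everything the model can say.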

There is also good reason to be cautious about the lure of nuanced models 
and theories (Healy 2017). While the general consideration of nuance is 
laudable, there are several nuance traps that theorists frequently fall into: the 
urge to describe the world in such fine-grain empirical detail that it provides 
no generalizable insight; the tendency to expand and hedge a theory with 
particulars in such a way as to close off the theory to rebuttal or testing; or the 
desire to add nuance merely to demonstrate the sophistication of the theorist. 
All these forms of nuance would result in a more comprehensive, but also 
less useful, risk assessment. In general, generative or exploratory work should 
emphasize foundational issues and insight over nuance; whereas, explanatory 
and evaluative work is necessarily narrower and more detailed. 

Although the goals of the assessment and the subject investigated should
dictate the level of complexity, there is plenty of subjective flexibility for the 
analyst. As previously mentioned, mathematical complexity tends to imbue an 
analysis with an air of legitimate sophistication even when the rational basis 
for the numerical valuation is weak. Conversely, explanations that are simple, 
symmetrical, or otherwise clever are often considered more elegant and prefer- 
able to more cumbersome explanations (Hossenfelder 2018). It is important to 
recognize that these are essentially aesthetic value judgments.³


Data Collection 


Value judgments exist throughout the data collection process. Many of these 
happen outside the control of the risk analyst due to widespread publica- 
tion bias (Young, Ioannidis & Al-Ubaydli 2008). Available data are limited by 
many factors, including the common value judgment that null results are not 
valuable information (Franco, Malhotra & Simonovits 2014), the preferential 
over-reporting of false positive findings (Ioannidis 2005), the apparent bias 
against research from developing nations (Sumathipala, Siribaddana & Patel 
2004), the bias toward publishing already distinguished authors (Merton 1968), 
or the difficulty of finding research not published in English—the lingua franca 
of international science. The bias against negative findings is particularly wide- 
spread because private sponsors of research will often discourage investiga- 
tors from publishing negative results for financial reasons. This can result in a 
published paper showing, for example, the efficacy of a drug despite multiple 
unpublished studies showing no effect that the scientific community and public 
never see. An effort to address the issue has been made in the medical community
by creating a database of clinical trials before they start, but the problem
is pervasive and longstanding; the bias for positive effects was first discussed
by Francis Bacon in the 1620 Novum Organum. There is also increasing awareness
that the widespread journal publication bias toward original findings is
a primary cause of an existing reproducibility problem. If researchers are dis- 
couraged from ever treading old ground, how can we be sure that established 
science is really established? 
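
The distortion created by unpublished null results can be made concrete with a small simulation. The sketch below is a hypothetical illustration and is not drawn from any of the studies cited above: it assumes a true effect of zero, a standard error of 1 for every study, and a literature that only publishes positive estimates exceeding the conventional 1.96 significance threshold. The function name and all numbers are assumptions chosen for illustration.

```python
import random
import statistics

def simulate_published_effects(true_effect=0.0, std_error=1.0,
                               n_studies=10_000, z_threshold=1.96, seed=1):
    """Return (all_estimates, published_estimates) for a toy literature.

    Assumption: a study is 'published' only if its estimate is positive
    and statistically significant (estimate / std_error > z_threshold).
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    published = [e for e in estimates if e / std_error > z_threshold]
    return estimates, published

all_estimates, published = simulate_published_effects()
print(f"mean of all studies:       {statistics.mean(all_estimates):+.2f}")
print(f"mean of published studies: {statistics.mean(published):+.2f}")
print(f"share published:           {len(published) / len(all_estimates):.1%}")
```

Under these assumptions the full set of studies averages close to zero, while the small published subset shows a large positive effect that does not exist.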

Counterintuitively, the bias toward original findings is also accompanied by 
a bias against novelty. That is, there is a preference for results that are new, but 
only incrementally so. This conservatism is partly propelled by a hypercompet- 
itive research funding market that tends to reward researchers who can prove 
they have already successfully performed similar research (Alberts et al. 2014). 
Another factor is the idea famously summarized by physicist Max Planck that 
science advances one funeral at a time. Like all social endeavors, science has a
hierarchy, and there is a general tendency to repress work that contradicts the
views of the most eminent scientists in a field, until they graciously accept the
new theory or ‘retire’ (Azoulay, Fons-Rosen & Zivin 2015).


> There may be a natural human aesthetic predisposition for reductionism, a
technique successful in both science and abstract art (Kandel 2016).

That said, many value decisions are made by the analyst. 


Data screening 


Analysts make a variety of value judgments regarding what data should be 
incorporated into a risk assessment. Data screening criteria can include rele- 
vance of search terms, reputational selection of sources, adherence to particular 
laboratory practices, findings with a certain statistical significance, or surrogate 
data similarity to the risk in question (MacGillivray 2014). What counts as relevant
data can be a particularly contentious epistemic value judgment based on
widely held reductionist views of causality, dating back to philosopher David
Hume, that tend to exclude some forms of otherwise compelling evidence
(see Anjum & Rocca 2019). These subjective decisions are one reason why 
meta-analyses, studies that quantitatively combine and assess prior research on 
a subject, frequently come to contradictory conclusions. 

While some criteria, such as statistical significance, are primarily epistemic 
value decisions, others are largely ethical. For example, it is common to treat 
potential harmful effects below the current detection limit as acceptable risks 
(Hansson 1999). While this may be pragmatic, it rests on the assumption that a
low-dose threshold is well established and uncontroversial, or simply that what
you do not know cannot hurt you. An even more general bias in data screening
is the tendency by experts to ignore evidence that has not been transmitted 
through the scientific community. Paul Feyerabend frequently noted the hubris 
of scientists who ignored the accumulated practical knowledge of history and 
local experience. But what should a risk analyst do? It is troublesome to assess 
the quality or get general acceptance of knowledge that has not already passed 
through peer review. Sometimes the best one can do is to actually look for 
indigenous knowledge and note when it contradicts accepted research. 

Similarly, some science policy experts (Sarewitz 2016; Martinson 2017) 
argue the scientific community is generating too much research of poor qual- 
ity, which makes finding the valid research all the more difficult. Their solution 
is to encourage scientists to publish more thoughtfully and less often. While it 
is hard to argue with the thoughtful part, the implementation sounds troubling. 
Yes, a great deal of research is not high quality, but—as discussed in the previ- 
ous chapter—only time and many eyes can tell if any particular research has 
value. Ironically, publishing is still the best way to disseminate information and 
improve the corpus of science.* 


* I also disagree with the very idea of ‘too much research.’ We are not running out of
scientific questions, so any increase in the number of scientists is generally positive. 


Data for rare events


Some risk events are so rare that little to no data are available. Several limited 
approaches have been used to address rare event risks, including statistical 
methods, such as bootstrapping (Efron & Tibshirani 1993); expert opinion 
and related variations, such as the Delphi method; using analogous situations; 
and bounding with scenarios (Goodwin & Wright 2010). Historical records, 
such as old tsunami markers in Japan, are an excellent source for scenario 
bounding of natural risk events. Even many anthropogenic risks, such 
as bioengineering and geoengineering, have analogous natural events for 
comparison. 
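
As an illustration of the bootstrapping approach mentioned above, the following sketch computes a percentile bootstrap interval for the mean of a small sample. The data are invented (a hypothetical record of wave heights from eight historical events), and the function name, resample count, and interval level are assumptions made for the example.

```python
import random
import statistics

def bootstrap_interval(sample, stat=statistics.mean, n_resamples=5_000,
                       level=0.90, seed=7):
    """Percentile bootstrap interval for `stat` of a small sample."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    replicates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int(((1 - level) / 2) * n_resamples)
    hi_idx = int((1 - (1 - level) / 2) * n_resamples) - 1
    return replicates[lo_idx], replicates[hi_idx]

# Hypothetical record: maximum wave heights (m) from eight historical events.
wave_heights_m = [2.1, 3.4, 1.8, 9.5, 2.7, 4.0, 3.1, 14.2]
low, high = bootstrap_interval(wave_heights_m)
print(f"sample mean: {statistics.mean(wave_heights_m):.2f} m")
print(f"90% bootstrap interval for the mean: {low:.2f} to {high:.2f} m")
```

Because every resample is built from the observed values, the method can never produce an event more extreme than the largest one already on record, which is one reason it is a limited tool for rare events.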

Sometimes the rarity of an event will lead a risk analyst to simply declare 
a risk non-existent—a rather bold epistemic assumption that should be used 
with caution. Although none are universally accepted, there are statistical 
methods for estimating the probability of events that have not occurred 
(Eypasch et al. 1995; Winkler, Smith & Fryback 2002; Quigley & Revie 
2011). It is better to declare something improbable unless you are sure it 
is impossible. 
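
One simple calculation of this kind is the so-called rule of three: if an event has not occurred in n independent trials, an approximate 95% upper bound on its probability is 3/n. The sketch below compares that shortcut with the exact one-sided bound. It is offered as an illustration and is not necessarily the method used in the works cited above; it assumes independent, identical trials, and the function name is hypothetical.

```python
def zero_event_upper_bound(n_trials, confidence=0.95):
    """Exact one-sided upper bound on p when 0 events occur in n_trials.

    Solves (1 - p) ** n_trials = 1 - confidence for p.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)

for n in (30, 100, 1000):
    exact = zero_event_upper_bound(n)
    print(f"n={n:5d}  exact 95% bound={exact:.5f}  rule of three={3 / n:.5f}")
```

The point of the exercise is that zero observed events never justifies a probability of zero; it only bounds the probability from above.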

One interesting subset of rare events is the low probability, high consequence 
risk—for example, a pandemic caused by avian influenza virus research. 
Low-probability events, as well as improbable, but unrefuted, catastrophic 
theories (Ćirković 2012), tend to get considerable attention in the media. One
possibility is that the public merely overreacts to the ‘social amplification of
risk’ (Kasperson et al. 1988). Alternatively, the attention may be a judgment
by the public that catastrophic risks should not be reduced to probabilistic 
expectation values (Coeckelbergh 2009). Furthermore, from an epistemic 
perspective, the probability of occurrence is actually the probability of the 
event conditioned on the theory being correct (Ord, Hillerbrand & Sandberg 
2010). Thus, if the model is wrong, the rare event might not be as rare as experts 
estimate. Public concern is therefore, to some degree, a measure of the public’s faith
in experts. 
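
The epistemic argument above amounts to the law of total probability: the overall chance of the event is the expert estimate weighted by the chance the model is right, plus the chance of the event under a flawed model weighted by the chance the model is wrong. The sketch below uses invented numbers purely to show the size of the effect; the function and parameter names are hypothetical.

```python
def overall_event_probability(p_event_if_model_right, p_model_right,
                              p_event_if_model_wrong):
    """Law of total probability over 'model right' and 'model wrong'."""
    p_model_wrong = 1.0 - p_model_right
    return (p_event_if_model_right * p_model_right
            + p_event_if_model_wrong * p_model_wrong)

# Hypothetical numbers, chosen only to show the effect.
p = overall_event_probability(
    p_event_if_model_right=1e-9,   # the expert estimate
    p_model_right=0.999,           # confidence that the model is sound
    p_event_if_model_wrong=1e-4,   # risk if the model is flawed
)
print(f"overall probability: {p:.3e}")
```

With these assumed inputs, a 0.1% chance of a flawed model raises the overall probability to roughly 100 times the expert estimate, so the result is dominated by the reliability of the model and not by the estimate itself.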


* Curiosity is a nearly universal human trait. All one needs is some training in
skeptical analytical thinking to yield a scientist capable of doing meaningful science.
Groundbreaking research can come from any imaginative, persistent, and/or lucky 
scientist. The idea of too much science is reminiscent of various claims that all the 
major discoveries of science have already been made (for example, science jour- 
nalist John Horgan’s 1996 book, The End of Science). Such arguments, dating back 
at least a century (Badash 1972), start with short-term trends based in facts and 
extrapolate to unsupportable conclusions. In short, who knows if what we claim 
to know is right, if some fundamental truths are undiscoverable, or what scientific 
revolutions we are capable of in the future? The only way to know is more science, 
more scientists, and more research. 


Expert opinion


The last, but probably largest, source of value judgments in data collection is 
the use of expert opinion. If the assessment is to use expert opinion, value judg- 
ments occur not just in the opinions of the experts themselves, but also in the 
selection of the experts and the method by which expert opinions are com- 
bined. There is no consensus on the best methods for collecting and combining 
expert opinions (Hammitt & Zhang 2013; Morgan 2014, 2015; Bolger & Rowe 
2015a, 2015b; Cooke 2015; Winkler 2015; Hanea et al. 2018). Furthermore, 
much has been said on the human imperfections of experts that affect their 
expertise (Laski 1931; Feyerabend 1978; Jasanoff 2016). 
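
To see why the choice of combination method is itself a value judgment, consider two common pooling rules applied to the same set of opinions. The sketch below is a hypothetical illustration, not a method endorsed by the works cited above: it compares an equal-weight arithmetic average of probabilities with an equal-weight geometric average of odds. The function names and the three expert estimates are invented.

```python
import math

def linear_pool(probabilities, weights=None):
    """Weighted arithmetic mean of expert probabilities."""
    n = len(probabilities)
    weights = weights or [1.0 / n] * n
    return sum(w * p for w, p in zip(weights, probabilities))

def geometric_odds_pool(probabilities, weights=None):
    """Weighted geometric mean of odds, converted back to a probability."""
    n = len(probabilities)
    weights = weights or [1.0 / n] * n
    log_odds = sum(w * math.log(p / (1.0 - p))
                   for w, p in zip(weights, probabilities))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical annual failure probabilities elicited from three experts.
expert_estimates = [1e-2, 1e-4, 1e-6]
linear = linear_pool(expert_estimates)
geometric = geometric_odds_pool(expert_estimates)
print(f"linear pool:         {linear:.2e}")
print(f"geometric odds pool: {geometric:.2e}")
```

When experts disagree by orders of magnitude, the arithmetic average is dominated by the most pessimistic expert while the geometric average sits near the middle opinion, so the two rules differ here by more than a factor of 30.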


Accounting for Uncertainty 


The treatment of uncertainty, which exists in all measurements and models, is, 
by definition, a fundamental issue for risk assessments. Whether deliberately 
or by accident, analysts make a variety of epistemic value choices on how to 
express and propagate uncertainty within a risk assessment. 


Deterministic versus probabilistic 


The easiest choice is often to temporarily ignore uncertainty and use a deter- 
ministic model to perform the risk assessment. A deterministic model can 
explore uncertainty by varying the model parameters and then building a range 
of scenarios. This at least gives the analyst a range of potential outcomes, albeit 
with no associated likelihoods. This approach limits the uses of a quantitative 
risk assessment but is useful when there is no defensible basis for quantifying 
uncertainty. 


Objective versus subjective probabilities 


Using the definition of risk as a probabilistic expectation value, the most popu- 
lar option in risk analysis is to use a probabilistic model where parameters and 
output are represented as distributions. In this case, the analyst must decide if 
the probabilities will be based solely on data or if they will also include subjec- 
tive probabilities (Hansson 2010b). The subjectivist approach to risk is increas- 
ingly preferred as it accommodates commonly used sources of uncertainty, 
such as expert opinion (Flage et al. 2014). This is often necessary when we 
consider trans-science—questions that are technically answerable by research 
but will not be answered because of practical limitations (Weinberg 1972). For 
example, if a particular failure risk for nuclear power plants was estimated to 
be 1 in a million per year, it would not be empirically verifiable because, with
only about 500 nuclear power plants worldwide, that failure would, on average, 
occur once in 2000 years. These types of low-probability events are unlikely to 
be testable or known with much confidence. 
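
The arithmetic behind this example can be checked in a few lines. The rate and plant count are the figures quoted above; the Poisson assumption and the 50-year observation window are added here only for illustration.

```python
import math

failure_rate_per_plant_year = 1e-6   # estimate quoted in the text
plants_worldwide = 500               # approximate count quoted in the text

fleet_rate_per_year = failure_rate_per_plant_year * plants_worldwide
mean_years_between_failures = 1.0 / fleet_rate_per_year

def prob_at_least_one(years, rate_per_year=fleet_rate_per_year):
    """P(one or more failures in `years`), assuming a Poisson process."""
    return 1.0 - math.exp(-rate_per_year * years)

print(f"fleet-wide rate: {fleet_rate_per_year:.1e} per year")
print(f"mean time between failures: {mean_years_between_failures:.0f} years")
print(f"chance of any failure in 50 years: {prob_at_least_one(50):.1%}")
```

Even half a century of worldwide operating experience would most likely show no failures at all, so the record could neither confirm nor refute the estimate.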

Given the difference in confidence one might place on a probability computed 
from empirical data versus expert opinion, differentiating between empirical 
and subjective interpretations of uncertainty may be important—especially 
if these various sources of data are being combined within the same analysis 
(Doorn & Hansson 2011). In some models, second-order uncertainty 
(uncertainty about the uncertainty) is included, but the utility and interpreta- 
tion of such efforts are still not universally accepted. 


Hybrid probabilistic methods 


A third approach is to distinguish between two types of uncertainty: aleatory
(normal variation within a population or process) and epistemic (lack of 
knowledge or incertitude). Some risk analysts account for aleatory uncertainty 
with standard probability distributions and epistemic uncertainty with other 
techniques. One option is to represent epistemic uncertainty with intervals 
that represent upper and lower bounds on possible values. This is done in 
probability bounds analysis or the more generalized method of imprecise 
probabilities (Walley 1991; Ferson & Ginzburg 1996; Weichselberger 2000; 
Ferson & Hajagos 2004). 
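
The interval idea can be illustrated with basic interval arithmetic. The sketch below is far simpler than a full probability bounds analysis: it only propagates lower and upper bounds through a single multiplication. The class, the inputs, and the units are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Closed interval [low, high] standing in for an unknown fixed value."""
    low: float
    high: float

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError("low must not exceed high")

    def __mul__(self, other):
        # The product's bounds lie among the four endpoint products.
        products = [self.low * other.low, self.low * other.high,
                    self.high * other.low, self.high * other.high]
        return Interval(min(products), max(products))

# Hypothetical inputs: an annual event probability known only to within a
# factor of ten, and a loss (in $ millions) known only to within bounds.
annual_probability = Interval(1e-4, 1e-3)
loss_millions = Interval(50.0, 200.0)

expected_loss = annual_probability * loss_millions
print(f"expected annual loss: {expected_loss.low:.3f} to "
      f"{expected_loss.high:.3f} $ million")
```

The output is a range with no likelihoods attached inside it, which is the intended behavior: the interval expresses what is not known without inventing a distribution for it.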

Using such alternatives to probabilistic uncertainty is still uncommon for 
two reasons. First, the mathematical techniques are less familiar. Second, 
there is a tendency in decision theory to assume that all outcomes are 
reasonably well-known. This error of treating real-world uncertainty like 
the known probabilities found in casino games has been called the ‘tuxedo 
fallacy’ (Hansson 2009). 


Non-probabilistic methods 


A fourth approach is to use alternatives to (or extensions of) probability theory 
(Pedroni et al. 2017), such as evidence theory, also known as Dempster-Shafer
theory (Dempster 1967; Shafer 1976, 1990), or possibility theory
(Dubois & Prade 1988; Dubois 2006). Likewise, related issues of vagueness or 
ambiguity can be addressed by fuzzy set theory (Zadeh 1965; Unwin 1986). 
Newer risk analytic methods with novel treatments of uncertainty include 
info-gap analysis (Ben-Haim 2006) and confidence structures (Balch 2012). 
There are even semi-quantitative approaches that take into account the varying 
degrees of confidence we have in knowledge (Aven 2008). These methods can 
be used in risk assessments with alternative conceptions of risk. A traditional
risk definition is hazard multiplied by probability. An alternative risk perspective
defines risk as a consequence combined with uncertainty (Aven & Renn
2009; Aven 2010). 


Uncertainty representation 


Methods of representing uncertainty reflect an analyst’s epistemological philos- 
ophy. This is in contrast to the moral uncertainty inherent in decision-making 
(Tannert, Elvers & Jandrig 2007). There is no current consensus on the best 
way to represent uncertainty, when to use one form over another, or what tools 
should be used to assess uncertainty (see Refsgaard et al. 2007). There is not 
even a consensus on the number of forms of uncertainty. Proponents of Bayes- 
ian statistics generally argue all uncertainty is a measure of belief irrespective 
of its source and probability is the best way to express it (O'Hagan & Oakley 
2004). At the other extreme are classification schemes that organize uncer- 
tainty by location, nature, and source that can result in dozens of unique types 
of uncertainty (van Asselt & Rotmans 2002; Walker et al. 2003). In practice, 
uncertainty representation may be based on extraneous factors, such as famili- 
arity, academic tradition, or ignorance of alternatives. Likewise, risk communi- 
cation concerns might eclipse epistemic concerns (Tucker & Ferson 2008). For 
example, an analyst might prefer mathematical simplicity over a slightly more 
informative, but less understandable, method of treating uncertainty. It may 
also be a good choice to err on the side of simplicity in cases where increasing 
analytic complexity could obscure lack of knowledge—‘a prescription that one’s 
analytical formulation should grow in complexity and computational intensity 
as one knows less and less about the problem, will not pass the laugh test in 
real-world policy circles’ (Casman, Morgan & Dowlatabadi 1999). 


Model uncertainty 


While parameter variability is often well-characterized in risk assessments, 
many forms of epistemic uncertainty are underestimated or ignored due to lack
of appropriate methodology. Model uncertainty, also referred to as model form 
uncertainty or model structure uncertainty, is rarely acknowledged, and the 
few methods for quantitatively addressing this form of uncertainty are not uni- 
versally accepted (Ferson 2014). Whether and how to account for model struc- 
ture uncertainty is yet another epistemic value judgment. In some cases, model 
structure may be the most important source of uncertainty. For example, when 
five well-respected consulting firms generated groundwater pollution risk 
models based on the same field data, the result was conceptually unique models 
with no common predictive capability (Refsgaard et al. 2006). 

Model structure uncertainty is more difficult to characterize than parameter
uncertainty because it is based on a lack of knowledge, rather than natural
variability, and is less amenable to probabilistic representation. Often, a model 
structure is selected from a range of possibilities and then the modeler proceeds 
to account for parameter uncertainty while treating the model structure as a 
given (Draper 1995). The following are some existing strategies for addressing 
model structure uncertainty. 


Lumped uncertainty 


Where copious data exist, the modeler can use a split data set to first select the
parameters during the model calibration phase and then evaluate the entire 
model during the validation phase. Deviations between the model output and 
the second data set can be partially attributed to model structure uncertainty. 
The structural uncertainty can then be accounted for by either increasing 
parameter uncertainty until it also accounts for structural uncertainty (this 
occurs in inverse modeling methods) or by adding an explicit structural uncer- 
tainty term—irreverently known as the fudge factor. The lumped uncertainty 
approach assumes the available data are reliable and representative and the 
underlying modeled processes are stationary. 
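
A minimal sketch of the split-data procedure follows. Everything in it is an assumption made for illustration: the synthetic observations, the straight-line model, the even/odd split, and the treatment of measurement noise as known. It calibrates the model on one half of the data, evaluates it on the other half, and lumps the unexplained error into a single structural term.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(xs, ys, slope, intercept):
    """Root-mean-square error of the fitted line on (xs, ys)."""
    return math.sqrt(sum((y - (slope * x + intercept)) ** 2
                         for x, y in zip(xs, ys)) / len(xs))

# Synthetic 'observations' from a mildly nonlinear process, so that a
# straight-line model has a structural error by construction.
rng = random.Random(3)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 0.15 * x ** 2 + rng.gauss(0, 0.5) for x in xs]

# Split the data: even-indexed points calibrate, odd-indexed points validate.
cal_x, cal_y = xs[0::2], ys[0::2]
val_x, val_y = xs[1::2], ys[1::2]

slope, intercept = fit_line(cal_x, cal_y)
validation_rmse = rmse(val_x, val_y, slope, intercept)
measurement_sd = 0.5  # assumed known measurement noise

# Lump whatever the noise does not explain into one structural term.
structural_sd = math.sqrt(max(validation_rmse ** 2 - measurement_sd ** 2, 0.0))
print(f"validation RMSE: {validation_rmse:.2f}")
print(f"lumped structural term: {structural_sd:.2f}")
```

The structural term is only as trustworthy as the two assumptions noted above: the validation data must be representative, and the process that generated them must not change over time.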


Sensitivity analysis with multiple models 


One of the mathematically simplest methods of addressing model structure 
uncertainty is to create multiple models that address the range of possible 
model structures. Each model is evaluated separately, and the output of all the 
models is summarized as a set of results. This approach was used in early cli- 
mate modeling projections. It has the benefit of being simple to understand, 
but it can become arduous if there are many models. Likewise, summarizing 
the many results and guessing the likelihood of particular model outcomes is 
less straightforward. Most importantly, this approach works on the substantial 
epistemic assumption that the full range of possible models has been described. 


Monte Carlo model averaging 


A Monte Carlo probabilistic model can account for multiple model forms 
by sampling each possible model and combining their results into a single 
probability distribution (Morgan, Henrion & Small 1990). The sampling can 
take advantage of resampling techniques, such as bootstrapping, and can 
even be weighted if some models are more probable. This method requires that
all possible model structures be identified by the analyst. Another concern
is that averaged distributions will likely underestimate tail risks: the
low-probability, worst-case events that occur at the tails of a distribution. More
fundamentally, some analysts object to the idea of combining theoretically 
incompatible models. 
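
The tail-risk concern can be demonstrated with a toy example. The sketch below assumes two invented loss models, one thin-tailed and one heavy-tailed, and arbitrary weights of 0.7 and 0.3; none of these choices comes from the cited work. It samples from the weighted mixture and compares an upper percentile with that of the heavy-tailed model alone.

```python
import random
import statistics

# Two hypothetical, structurally different models of the same annual loss.
def lognormal_model(rng):
    return rng.lognormvariate(0.0, 0.5)

def heavy_tail_model(rng):
    return rng.paretovariate(2.5)

def sample_mixture(models, weights, n_samples=50_000, seed=11):
    """Draw each sample from a model chosen at random by its weight."""
    rng = random.Random(seed)
    chosen = rng.choices(models, weights=weights, k=n_samples)
    return [model(rng) for model in chosen]

def quantile(samples, q):
    """Empirical quantile by sorting (no interpolation)."""
    ordered = sorted(samples)
    return ordered[min(int(q * len(ordered)), len(ordered) - 1)]

models = [lognormal_model, heavy_tail_model]
combined = sample_mixture(models, weights=[0.7, 0.3])
heavy_only = sample_mixture([heavy_tail_model], weights=[1.0])

print(f"median, combined:               {statistics.median(combined):.2f}")
print(f"99th percentile, combined:      {quantile(combined, 0.99):.2f}")
print(f"99th percentile, heavy-tailed:  {quantile(heavy_only, 0.99):.2f}")
```

If the heavy-tailed model happens to be the correct one, the averaged distribution understates the 99th percentile loss, because most of the sampling weight was given to the more benign model.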


Bayesian model averaging 


Bayesian model averaging is similar to Monte Carlo averaging in that it com- 
bines all the identified potential model structures into an aggregated output 
(Hoeting et al. 1999). In this case, the model uncertainties, both parameter and
structural, are evaluated together. Bayesian averaging has the same limitations
as the Monte Carlo approach: completeness concerns, averaging incompatible 
theories, and underestimating tail risks. 
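
In its simplest form, the weighting step works as follows: each model receives a posterior weight proportional to its prior probability times its marginal likelihood, and the aggregated output is the weighted average of the individual predictions. The sketch below assumes equal priors and uses invented log marginal likelihoods and predictions; it omits the parameter-level integration that a full analysis would perform, and the function name is hypothetical.

```python
import math

def posterior_model_weights(log_evidence, prior=None):
    """Posterior model probabilities from log marginal likelihoods."""
    k = len(log_evidence)
    prior = prior or [1.0 / k] * k
    log_post = [le + math.log(p) for le, p in zip(log_evidence, prior)]
    shift = max(log_post)                      # avoid underflow in exp()
    unnorm = [math.exp(lp - shift) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical log marginal likelihoods and predictions for three models.
log_evidence = [-120.0, -121.0, -125.0]
predictions = [4.0, 6.5, 20.0]   # each model's predicted annual loss

weights = posterior_model_weights(log_evidence)
averaged = sum(w * y for w, y in zip(weights, predictions))
print("posterior weights:", [round(w, 3) for w in weights])
print(f"model-averaged prediction: {averaged:.2f}")
```

The third model predicts a loss several times larger than the others but receives a weight under 1%, so it barely moves the averaged result. This is the mechanism by which averaging can mute worst-case outcomes.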


Bounding analysis 


Using bounding analysis, the results for all the potential models are compared 
and an envelope is drawn around the entire set of results. The final product is a 
single bounded region likely to contain the correct model output. A benefit of
this method is that not every possible model structure needs to be identified, only
those that would yield the most extreme outcomes. While there is no guarantee the
analyst will be able to identify the extreme model structures, it is a simpler goal 
than identifying all possible models. Bounding analysis also avoids the issues of 
underestimating tail risks and averaging incompatible theories—it propagates 
rather than erases uncertainty (Ferson 2014). Weaknesses include the inability 
to weight the credibility of individual model structures and the inability to dis- 
tinguish likelihoods within the bounded region. 
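
A bounding envelope can be sketched by evaluating each candidate model at the same points and keeping only the smallest and largest results. The three exceedance models below are invented for illustration and have no connection to the cited work; the function names are hypothetical.

```python
import math

# Hypothetical exceedance models: P(loss > x) under three model structures.
def exponential_model(x):
    return math.exp(-x / 2.0)

def pareto_model(x):
    return min(1.0, (1.0 / x) ** 1.5) if x > 0 else 1.0

def weibull_model(x):
    return math.exp(-((x / 3.0) ** 2))

def envelope(models, thresholds):
    """Lower and upper bound on P(loss > x) across all supplied models."""
    return [(x, min(m(x) for m in models), max(m(x) for m in models))
            for x in thresholds]

models = [exponential_model, pareto_model, weibull_model]
for x, lower, upper in envelope(models, thresholds=[1.0, 5.0, 10.0]):
    print(f"P(loss > {x:4.1f}) is between {lower:.2e} and {upper:.2e}")
```

Unlike the averaging methods, the upper edge of the envelope is set entirely by the most pessimistic model at each point, so no tail risk is diluted. The cost is that nothing can be said about where within the envelope the truth is most likely to lie.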


Unknown unknowns 


Analysts frequently encounter situations where uncertainty is deep but recognized.
A variety of methods are available for dealing with extreme uncertainty
in risk assessments (Cox 2012b). However, sometimes we do not even
recognize our own ignorance. The phrase ‘unknown unknowns’ is relatively
new, but the underlying concept is ancient: in Plato’s Meno, Socrates points
out that one cannot inquire about a topic of which one is wholly ignorant. By
definition, we are ignorant of unknown unknowns, so we generally exclude
them from risk assessments. Nonetheless, approaches for reducing ignorance
have been proposed, such as imaginative thinking and increased public discourse
(to be discussed in more detail in the last chapter) (Attenberg, Ipeirotis
& Provost 2015). While the uncertainty associated with ignorance is unquantifiable,
acknowledging the limitations of our knowledge is a display of Socratic
wisdom that improves risk communication (Elahi 2011). 

While there is uncertainty inherent in all knowledge, formal risk assessments 
tend to make specific, often quantitative, claims regarding the level of certainty 
of knowledge. Thus, special attention is warranted regarding the caveats placed 
in an assessment that reflect the analyst’s value judgments regarding what is 
known and knowable. An analyst may believe that an assessment has captured 
all the salient points worthy of consideration to a degree of accuracy and pre- 
cision that conclusively answers the question. This analyst is likely to present 
findings with few caveats. A more skeptical and humble analyst will add quali- 
fiers to assessments so readers do not over-interpret the results. 


Comparing Risks 


One reason to create a risk estimate is to compare it to other hazards or risk 
management options. The comparison process is full of value judgments. 
For example, one of the most common comparisons of a hazard is to natu- 
rally occurring levels of a potential harm, such as background radiation levels 
(Hansson 2003). Using natural exposure levels as a standard for comparison 
when there is scant reason to assume this constitutes an acceptable level of 
harm is a value judgment. However, the widespread use of sunscreen suggests 
the general public does not always find natural risks acceptable. Experiencing a 
level of harm by default does not imply technologies that subject us to similar 
levels of risk are acceptable (Fischhoff et al. 1978; Slovic 2000). 

Along the same lines, public concerns regarding various technologies, such
as synthetic food additives or genetically modified foods, are sometimes based
primarily on the unnaturalness of the technology (Viscusi & Hakes 1998; Hansson
2012). However, this assessment is based on the belief that naturally occurring
substances are safer. This is a reasonable assumption in the sense that humans have
co-existed with naturally occurring materials for a long time. This provides
extensive experience helpful for forming judgments of safety. However, it is 
a naive generalization to assume natural equals safe. For example, arsenic is 
naturally occurring in the groundwater of some areas, and it poses a larger 
public health threat than many anthropogenic water contaminants. The basic 
assumption that natural substances are benign is even ensconced in US regula- 
tions; the FDA has far fewer requirements for botanical medicines (sold to the 
public as nutritional supplements) than for synthetic drugs. Even equivalent 
harms caused by natural sources are perceived to be less scary than human-caused
harms (Rudski et al. 2011). The source of this distinction appears to be
the ‘risk as feeling’ (Loewenstein et al. 2001) model and ‘affect heuristic’ (Slovic
et al. 2007), which suggest perceptions of risk are dependent upon the emotions
associated with the risk.


Incommensurability


One of the most basic assumptions in risk assessments is the belief that risks can be
compared—even using the precautionary principle is an implicit comparison 
between the potential risk and the status quo. However, is it always possible to 
compare any risk? Are some risks incommensurable? Certainly, risks that are 
different in nature (such as health risks and risk of habitat loss) are difficult to 
compare (Espinoza 2009). Any such comparison requires the use of a common
unit of measure, such as economic value, or equally controversial subjective 
rankings. However, even risks that appear to be of the same kind (such as all 
risks that could shorten a human life) can still be difficult to compare due to 
important ethical distinctions. Public rejection of quantitative risk assessments 
in the past may not be due to risk communication failures but rather to the fail- 
ure of these formal assessments to account for ethical distinctions important 
to the public (Hansson 2005; Kuzma & Besley 2008). Some distinctions often 
ignored in quantitative risk assessments that do not share ethical equivalency 
include (Slovic 1987; Gillette & Krier 1990; Cranor 2009; Espinoza 2009):


• natural versus anthropogenic risks;
• detectable versus undetectable (without special instrumentation) risks;
• controllable versus uncontrollable risks;
• voluntary versus imposed risks;
• risks with benefits versus uncompensated risks;
• known risks versus vague risks (‘ambiguity aversion’ (Fox & Tversky 1995));
• risks central to people’s everyday lives versus uncommon risks;
• future versus immediate risks; and
• equitable versus asymmetric distribution of risks (in both space and time).
  Similar justice issues arise when the exposed, the beneficiaries, and the
  decision-makers are different groups (Hermansson & Hansson 2007; Hansson
  2018).


In each pair above, the first risk type is generally preferred to the second 
risk type. In practice, risks often fit multiple categories and can be ordered 
accordingly. For example, common, self-controlled, voluntary risks, such as 
driving, generate the least public apprehension; whereas, uncommon, imposed 
risks without benefits, such as terrorism, inspire the most dread. Thus, it is
no surprise that US spending on highway traffic safety is a fraction of the spending
on counter-terrorism despite the fact that automobile accidents killed about
100 times more Americans than terrorism in the first decade of this century
(which includes the massive 9/11 terrorist attack). Ignoring these risk
distinctions can lead to simplified assessments that are completely disregarded
by the public. It is for this reason that risk governance guidelines stress the
importance of social context (Renn & Graham 2005). While the importance
of these distinctions has been understood for some time in theory (Jasanoff
1993), there remains limited evidence of their being taken into account in practice
(Pohjola et al. 2012). Because these ethical distinctions are often publicly
expressed as moral emotions (duty, autonomy, fairness, etc.), ignoring the emotional
content of risk assessments decreases both their quality and the likelihood they will
be followed (Roeser & Pesch 2016).


Risk ranking 


Some risk assessments may also employ a risk ranking method in the final 
comparison. The ranking may be quantitative or qualitative, single or multi-attribute,
and can take many forms, including letter grades, number grades,
cumulative probability distributions, exceedance probabilities, color categories,
and word categories. The ranking method selection is an epistemic value
judgment with important risk communication implications (Cox, Babayev & 
Huber 2005; Cox 2008; MacKenzie 2014). Most risk rankings are simply based 
on probabilities and consequences, but rankings incorporating ethical dimen- 
sions, such as the source of risk (Gardoni & Murphy 2014), have been proposed. 
This generates a more nuanced, but also more subjective, form of ranking. For 
example, which is the most concerning: 1,000 deaths caused by heart disease, 
100 smoking-related deaths, or 10 homicides? The question is almost meaning- 
less when stripped of its context. Unfortunately, this is precisely what happens 
in a risk ranking exercise without accompanying qualitative descriptions. 


A Value Assumption Roadmap 


Given the myriad value assumptions discussed in this chapter, an aid is useful. 
The following list (Table 1) is organized chronologically in the risk assessment 
process so it can be used as a checklist for risk analysts. The list is not exhaus- 
tive in its coverage of potential value judgments, but it highlights common and 
contentious assumptions that, left unexamined and unaddressed, decrease the 
utility of a risk assessment. 

The roadmap of value assumptions ignores some of the more uncontroversial 
values inherent in risk assessment as well as broader value discussions within 
science. For example, the general debate over what constitutes quality science 
(Kuhn 1962; Feyerabend 1970; Lakatos 1970) is itself a value-based argument: 
‘The ethos of science is that affectively toned complex of values and norms 
which is held to be binding on the man of science’ (Merton 1942). However, it is 
not always clear what epistemic, ethical, or aesthetic values are considered to be 
uncontroversial or for how long they will remain uncontested (Hansson & Aven 
2014). Maximizing happiness is a common goal in contemporary economic
analyses. It might even seem reasonable to think of it as an uncontroversial
value. Yet, different eras and cultures have valued duty over self-interest.


Table 1: A summary process map of value judgments in risk assessments.

Step                     Fundamental Value Questions

Selecting a topic        • How are hazards screened?
                         • What heuristics are influencing choice?

Defining the             • What is an appropriate time, space, and population?
boundaries               • Holistic or component analysis?

Choosing the             • What unit will be used?
assessment form          • What is a life worth?
                         • What are deeply held values worth?
                         • What discount rate should be used?
                         • Qualitative or quantitative?
                         • Which definition of risk?
                         • Maximizing efficiency or resiliency?
                         • Focus on preventing false positives or false negatives?

Model selection          • Accuracy or precaution?
                         • Theoretical or empirical?
                         • Simple or complex model?

Data selection           • How is data screened?
                         • How are rare events treated?
                         • How is expert opinion used?

Accounting for           • Deterministic or probabilistic?
uncertainty              • Objective or subjective probabilities?
                         • How is incertitude addressed?

Comparing risks          • Can the risks be compared?
                         • Qualitative ethical distinctions?
                         • Risk ranking?


$ For an overview of various ethical frameworks as they apply to risk assessment, see
Rozell (2018).


Example: Risk assessment of farmed salmon


To see the utility of the value judgments map, it helps to apply it to an
actual risk assessment debate. In the following example, the original study
found high concentrations of carcinogenic organochlorine contaminants
in farm-raised salmon and concluded the risk of consumption outweighed
the benefits (Hites et al. 2004). The analysis prompted a series of strong
response letters. One letter pointed out that, even using a conservatively
protective US Environmental Protection Agency (EPA) linear cancer risk model,
the expected number of cancer cases from consuming farmed salmon was a
fraction of the number of cardiovascular mortalities that would be avoided
by eating salmon (Rembold 2004). Furthermore, the critique noted the
quality of the data was not the same; the cardiovascular benefits data were
based on randomized clinical trials, while the cancer risk data were based on
less reliable observational studies and nonhuman dose-response models. The
response from the original authors raised the possibility of other non-cancer 
harms (i.e., neurological) from contaminated fish consumption, as well as a 
reminder that the beneficial omega-3 fatty acids found in salmon were avail- 
able from other non-contaminated dietary sources. 

A second letter also compared the cardiovascular benefits of salmon
consumption to cancer risks and additionally included a value-of-information
analysis to argue that any uncertainty in the risk and benefit data did not affect
the assessment that salmon consumption was beneficial (Tuomisto et al. 2004).
For this reason, the letter went so far as to imply that the original study was
non-scientific. Again, the response pointed out that fish were not the sole source of
dietary cardiovascular benefits.
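
The logic of a value-of-information argument can be illustrated with a toy
calculation. The sketch below is not the analysis from the cited letter; the
function name, the two states, and every number in it are hypothetical and
chosen only for illustration. It computes the expected value of perfect
information: the gain from learning the true state of the world before choosing.

```python
def expected_value_of_perfect_information(payoffs, state_probs):
    """Expected gain from learning the true state before acting.

    payoffs[action][state] is the net benefit of an action in a state;
    state_probs[state] is the probability currently assigned to that state.
    """
    # Best expected payoff when the choice must be made under uncertainty.
    best_without_info = max(
        sum(state_probs[s] * payoffs[a][s] for s in state_probs)
        for a in payoffs
    )
    # Expected payoff if the state were revealed first and the best action taken.
    best_with_info = sum(
        state_probs[s] * max(payoffs[a][s] for a in payoffs)
        for s in state_probs
    )
    return best_with_info - best_without_info


# Hypothetical probabilities and net health benefits (arbitrary units).
state_probs = {"low cancer risk": 0.7, "high cancer risk": 0.3}

# Case 1: eating salmon is the better choice in every state.
dominant = {
    "eat salmon": {"low cancer risk": 10.0, "high cancer risk": 4.0},
    "avoid salmon": {"low cancer risk": 0.0, "high cancer risk": 0.0},
}

# Case 2: the better choice depends on which state is true.
contested = {
    "eat salmon": {"low cancer risk": 10.0, "high cancer risk": -5.0},
    "avoid salmon": {"low cancer risk": 0.0, "high cancer risk": 0.0},
}

evpi_dominant = expected_value_of_perfect_information(dominant, state_probs)
evpi_contested = expected_value_of_perfect_information(contested, state_probs)
print(evpi_dominant, evpi_contested)
```

When one option is best in every state, resolving the uncertainty cannot change
the decision and the value of information is zero, which is the shape of the
letter's claim. Note that the payoff table itself encodes value judgments about
what counts as a harm or a benefit.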

A third letter questioned the EPA’s linear dose-response model, citing
additional data that suggested there were no carcinogenic effects at low exposures
(Lund et al. 2004). The response by the original authors argued the additional
study used a sample size too small to detect the estimated cancer risk.
Furthermore, they pointed out that the potential neurological and cancer risks are
larger for young people due to bioaccumulation, while the cardiovascular effects
primarily benefit older individuals, which suggested the need to distinguish
populations.

Looking at the various critiques and responses, value assumptions were 
made corresponding to each step of the risk assessment process. Boundary 
value assumptions were made regarding what can be counted as a risk (cancer 
and neurological impairment) or a benefit (cardiovascular health) and whether 
sensitive populations (children) should be considered separately. Likewise, 
there were debates regarding the appropriate risk model (the EPA linear model 
or a no-effects threshold model); what data were worthy of inclusion in the 
assessment (observational studies, animal studies, and small sample size
studies); and how important it was to account for uncertainty. Finally, there was
an important value judgment regarding risk comparison: will people who
potentially stop eating salmon substitute other foods rich in omega-3 fatty acids
(for example, should salmon be compared to walnuts)? While analyzing the value
assumptions in the various risk assessments does not resolve the underlying
scientific questions, it does help clarify the arguments and provide insight into
what fundamental issues should be explored further.


Example: The avian influenza research debate


Returning to the avian influenza research discussed in the first chapter, we
can look at some of the risk assessments associated with the debate and see
where the contentious value judgments arose. While the NIH requested both
qualitative and quantitative risk-benefit assessments, the general belief among
scientists was that a quantitative assessment would be more credible. This was a
dubious assumption given the wide range of results from previous attempts.
For example, a simple probabilistic risk assessment from the biosecurity
community estimated research with potential pandemic pathogens would create
an 80 percent likelihood of a release every 13 years (Klotz & Sylvester 2012).
An updated assessment estimated the risk of a research-induced pandemic
over 10 years to be between 5 and 27 percent (Klotz & Sylvester 2014).
Meanwhile, a risk assessment from the public health community estimated that for
every year a laboratory performed this type of research, there was a 0.01 to
0.1 percent risk of a pandemic, which would result in 2 million to 1.4 billion
fatalities (Lipsitch & Inglesby 2014). When these results were presented at an
NRC symposium, Ron Fouchier, the head of the original controversial study,
responded, ‘I prefer no numbers rather than ridiculous numbers that make
no sense.’

Dr. Fouchier’s subsequent risk assessment estimated a lab-induced pandemic
would occur approximately every 33 billion years (Fouchier 2015b). Since this
is more than twice the known age of the universe, his calculated risk was
essentially zero. Dr. Fouchier noted there have been no confirmed
laboratory-acquired flu infections nor any releases in decades and this supported
his conclusion that the risk is now non-existent. Critics of his assessment
questioned his methodology and selection of evidence (Klotz 2015; Lipsitch &
Inglesby 2015). Dr. Fouchier responded with the same complaints (Fouchier 2015a).
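
The estimates quoted above are stated over different time horizons, which makes
the size of the disagreement hard to see. The sketch below puts them on a rough
per-year scale. It assumes a constant and independent probability in each year,
which is a simplifying assumption made here and not necessarily the method used
in any of the cited assessments. The estimates also refer to different events
(a release versus a pandemic) and different bases (the whole research
enterprise versus a single laboratory), so the comparison is indicative only.

```python
import math


def annual_probability(cumulative_prob, years):
    """Constant annual probability implied by a cumulative probability over `years`.

    Assumes each year is independent with the same probability.
    """
    return 1.0 - (1.0 - cumulative_prob) ** (1.0 / years)


# Klotz & Sylvester (2012): 80 percent likelihood of a release every 13 years.
release_per_year = annual_probability(0.80, 13)

# Klotz & Sylvester (2014): 5 to 27 percent risk of a pandemic over 10 years.
pandemic_per_year_low = annual_probability(0.05, 10)
pandemic_per_year_high = annual_probability(0.27, 10)

# Lipsitch & Inglesby (2014): 0.01 to 0.1 percent risk per laboratory-year,
# with 2 million to 1.4 billion fatalities if a pandemic occurs.
# Multiplying the endpoints gives a crude range of expected fatalities.
expected_fatalities_low = 0.0001 * 2e6
expected_fatalities_high = 0.001 * 1.4e9

# Fouchier (2015b): one lab-induced pandemic roughly every 33 billion years.
fouchier_per_year = 1.0 / 33e9

# Gap between the lowest Klotz & Sylvester (2014) figure and Fouchier's figure.
orders_of_magnitude = math.log10(pandemic_per_year_low / fouchier_per_year)

print(f"release per year (2012 estimate): {release_per_year:.3f}")
print(f"pandemic per year (2014 estimate): "
      f"{pandemic_per_year_low:.4f} to {pandemic_per_year_high:.4f}")
print(f"expected fatalities per lab-year: "
      f"{expected_fatalities_low:.0f} to {expected_fatalities_high:.0f}")
print(f"gap to Fouchier estimate: about {orders_of_magnitude:.1f} orders of magnitude")
```

Under these assumptions the annual estimates differ by roughly eight orders of
magnitude, which illustrates why a quantitative assessment did not settle the
debate.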

A similar question over appropriate evidence arose when proponents of
gain-of-function research posited that the 1977 flu pandemic was caused by a
vaccine trial or vaccine development accident rather than a research laboratory
release (Rozo & Gronvall 2015). They concluded ‘it remains likely that to this
date, there has been no real-world example of a laboratory accident that has
led to a global epidemic.’ Critics have argued this nuanced position is largely
irrelevant to the gain-of-function debate (Furmanski 2015). Because vaccine
development is still a primary goal of gain-of-function research, a vaccine
mishap is no less worrisome than a research lab release; an epidemic could occur
regardless of which lab made the fatal error.

An even more fundamental values debate over what constitutes evidence
arises over the essential purpose of the research. Genetic analysis of influenza
viruses cannot yet predict the functional behavior of a virus (Russell et al. 2014;
M et al. 2016). Likewise, a specific sequence of mutations in the lab is not
guaranteed to be the most likely to occur in nature. This limits the immediate
value of the research for practical applications, such as vaccine design. Critics
of the H5N1 research argued the lack of immediate application was a reason
not to perform the work. Proponents had a different view of the same facts. Not
understanding the connection between genes and virus behavior was a gap in
knowledge that must be closed, and gain-of-function research was the best way
to obtain this valuable information. As with many things in life, perspective
is everything.

The avian influenza debate also included value-laden risk comparisons. For
example, Dr. Fouchier argued most of the biosafety and biosecurity concerns
raised about his work also applied to natural pathogen research, which was
not restricted; thus, gain-of-function research should not be restricted either
(Fouchier 2015a). Temporarily ignoring the possibility that the consequences
of an engineered virus release might be much greater, the underlying
assumption of this argument is that natural pathogen research is relatively safe.
But, as previously discussed, comparisons to nature are appealing but do not
resolve the essential risk question. Instead, this argument could be flipped and
interpreted as suggesting that research on naturally occurring pathogens may
also require further restriction.

Identifying these value judgments is useful in the difficult task of assessing
the low probability-high consequence risk of a pandemic. Given the
infrequency of major influenza pandemics, Harvey Fineberg, former president of
the US National Academy of Medicine, has noted, ‘scientists can make relatively
few direct observations in each lifetime and have a long time to think about
each observation. That is a circumstance that is ripe for over-interpretation’
(Fineberg 2009).


Is Risk Assessment Useful? 


Risk analysis has become more challenging over time. For most of human his- 
tory, the assessment and management of risk was simply trial and error. How- 
ever, as the 20th century unfolded, the power and scale of technology grew, and 
it became evident that society needed to assess risks proactively. Science has 
also allowed us to recognize and measure more subtle hazards. Coupled with a 
general decreasing tolerance for risk in modern industrialized society, the chal- 
lenges to the field of risk analysis are considerable. 

Where risks and benefits are clear and certain, a formal risk assessment is
generally unnecessary; the process is intended for contentious issues where
the correct policy decision is less obvious. However, it is in these very
situations where the outcome of a risk assessment is strongly influenced by the
many inherent value judgments (often unknowingly) made by the analyst. So
are risk assessments useful? The short answer is yes. Even though this review
of subjective values in risk assessment could be interpreted as a critique of the
process, it is important to dispel any notion that ‘subjective’ is a derogatory
term or that it is necessarily arbitrary or based on a whim. Subjective values
can be deeply held convictions with rational supporting arguments. Past neglect of
value judgments, especially ethical considerations, has ignored the importance
of emotional content in decision-making or treated it as merely coincidental
(Roeser 2011). Appreciation of these value-laden assumptions as inherent to
risk assessment improves both the process and the final assessment.

Risk analysts who understand they are not merely collecting and tallying the 
facts surrounding a policy question will greatly improve their work. Yet it is no 
simple feat to maintain a skepticism and awareness of one’s own assumptions. If 
an analyst believes risk assessments produce objective answers, any assessment 
produced will understate its subjectivity and incertitude. Narrow risk assess- 
ments of well-understood phenomena with ample data might be uncontested, 
but formal risk assessments rarely resolve public debates regarding controver- 
sial science and emerging technologies. The various forms of value assump- 
tions made in risk assessments are a primary reason for the common inability 
of scientific studies to resolve disputes. Astute stakeholders and policymakers 
intuitively understand the limitations of risk analysis, and any assessment that is 
overly conclusive will be dismissed as deeply biased or naive. Considering and 
clearly discussing the value-laden assumptions in a risk assessment improves 
trust in the provided information by allaying concerns of hidden agendas. 

We still do not fully understand how risk assessment can be used to build 
consensus and reach decisions (Aven & Zio 2014). However, honest attempts to 
account for value judgments can aid rather than hinder public trust in a formal 
risk assessment where the ultimate goal is to provide information that is both 
useful and credible. 


Technological Risk Attitudes in Science Policy


Science and technology policy debates frequently call for risk-benefit
assessments based on sound science. But what constitutes credible science is a
contentious issue in its own right (Yamamoto 2012). There is some consensus over
what good science should look like in theory but much less regarding what it
looks like in practice or in individual cases (Small, Güvenç & DeKay 2014).
Furthermore, the science underlying many controversies is sufficiently
complex and well-studied that competing sides can take their pick of
defensible science with which to argue their position (Sarewitz 2004). The result is
that formal risk-benefit assessments usually fail to resolve debates over
controversial technology. The detailed arguments laid out in the prior two chapters
hopefully provide a convincing case that the legitimate subjectivity found in
risk-benefit assessments is unavoidable.

Science and technology studies pioneer Sheila Jasanoff criticizes the modern 
attitude that difficult technological decisions are always resolvable with further 
research (Jasanoff 2007). She argues the most difficult parts of the decision 
process are often ethical and political—not scientific. Thus, delaying decisions 
for more fact-finding is often a form of wishful thinking. Calls to perform 
risk-benefit assessments for potentially dangerous technologies often fall into 
this category. 

In science and technology policy, we frequently encounter debates where well- 
intentioned and reasonable individuals still arrive at very different conclusions. 
This leads to the obvious question underlying the subjective dimensions of 
science and technology policy: given the same data, why do reasonable indi- 
viduals disagree? 

In any decision process, it is believed that individuals tend first to evaluate novel
information using mental shortcuts (heuristics) that are replaced by more
reasoned thinking as familiarity with the subject increases (Chaiken 1980;
Kahneman 2011). The process of evaluating science research and new technologies
is no different (Scheufele 2006). The heuristic-to-systematic thinking model
also applies to interpreting risk assessments (Kahlor et al. 2003). New
information can influence an individual’s attitude, but any pre-existing attitude also
influences how new information will be interpreted (Eagly & Chaiken 1993).
Individuals with positive attitudes about technology will tend to expect more
benefits from new technologies, while those with negative attitudes will expect
fewer benefits (Kim et al. 2014).

Many factors can influence attitudes about technology. The purpose here is 
to discuss the nature of technological risk attitudes that may account for why 
well-informed and reasonable people disagree about controversial research and 
emerging technologies. An outcome of this discussion is insight into the man- 
agement of controversial research (to be discussed in the following chapter). 


Technological Optimism and Skepticism 


Although attitudes regarding technological risk exist along a broad continuum, 
for our purpose it is helpful to define two general categories: technological 
optimism and technological skepticism. This simplification is useful because 
we are primarily concerned with policy formulation and the binary outcome 
of whether an individual endorses or opposes a particular line of research or 
technology—even the ambivalent and indifferent, which may constitute the 
majority of the general public (Seidl et al. 2013), will eventually (and perhaps 
unwittingly) fall into one category or the other. The dual categorization has 
also been used by other academic and popular writers (e.g., Costanza 1999, 
2000). However, these categories do not tell us why individuals have particular 
technological risk attitudes. But first, let us start by defining what we mean by a 
technological optimist or skeptic. 


Technological optimism 


Technological optimists believe in the liberating power of technology: modern
medicine liberates us from disease and space exploration liberates us from
Earth. This attitude is captured in the modernist¹ movement and is still a
common attitude (e.g., Ridley 2010; Tierney 2010; Pinker 2018; Rosling, Ronnlund
& Rosling 2018), particularly among Americans, with good reason. Over the
past century, life expectancy has steadily increased worldwide with no signs of
ending. Technological optimists have a basic faith in technology and require
proof of harm to abandon any specific technology. An even more extreme
technophilia seems to be prevalent in Silicon Valley, where there is often effusive
praise for all things related to the internet and related communications
technology. The underlying assumption is that these innovations have and will
continue to fundamentally restructure human interactions to everyone’s benefit.
Some of the expected future benefits are as yet unimagined based on the
observation that technology is frequently repurposed by its users in surprising ways.
Technological optimists see future repurposing of technology as empowering
and creative rather than potentially harmful. For technological optimists, even
problems that are caused by technology have a technological solution (e.g.,
geoengineering as a solution for global warming).

¹ Modernism, with its various manifestations in Western philosophy, architecture,
art, literature, and so on, is embodied in Ezra Pound’s command, ‘Make it new!’


Technological skepticism 


Technological skeptics reject the technology-as-panacea paradigm. This atti- 
tude is more closely represented in the postmodern era (along with small 
enduring enclaves of pre-modernists, such as the Amish) and is linked to some 
of the failings of modern industrialization. In particular, within the environ- 
mental movement, there has been an ongoing critique of modern Western 
society that includes a general aversion to technology (Leopold 1949; Carson 
1962; Naess 1973). However, the roots of technological skepticism date back 
to at least the early 19th century as the Industrial Revolution was underway in 
Great Britain. The Luddite rebellion, a brief spasm of violence between 1811 
and 1813, was a reaction to the social upheaval caused as the steam engine and 
power loom rapidly shifted the wool industry in central England from family 
cottage weavers to coal-powered mills run by a few adults and cheap child labor 
(Sale 1995). It is no coincidence that Mary Shelley’s Frankenstein, a seminal example
of technological skepticism, was published in London in 1818. The full history 
of technological skepticism is too rich to cover here, but some notable efforts 
have been made to trace the thread of this philosophy to modern times (Fox 
2002; Jones 2006). 

Modern technological skeptics are more likely to question and critique the 
work of scientists and engineers. These criticisms are sometimes met with the 
reflexive scornful label of ‘neo-Luddite,’ but the attitude is not associated only
with those who shun technology. The ranks of technological skeptics include 
engineers (e.g., Joy 2000) who recognize that our present-day society is privi- 
leged and powerful due to technology but that this technology hangs over our 
heads like a sword of Damocles. 


Explaining the Differences 


If we accept that there are differences in technological risk attitudes that can be
roughly categorized, we invite questions regarding the nature and measurement
of these differences. The following is a summary of some notable attempts, with
varying degrees of success, to explain the differences in risk attitudes that are
relevant to science and technology.


Cultural Theory 


The cultural theory of risk proposes that risk attitudes originate from cultural
influences rather than from individual calculations of utility as proposed by the
rational choice theory, which undergirds traditional risk analysis. That is, indi- 
viduals filter information through cultural world views that determine their 
perception of risk. The theory categorizes four risk ideologies: laissez-faire 
individualists, social justice egalitarians, utilitarian hierarchists, and apathetic 
fatalists (Douglas & Wildavsky 1982). Individualists view human ingenuity 
as boundless and nature as robust. This roughly corresponds to technological 
optimism. Conversely, egalitarians, who roughly correspond to technological 
skeptics, view nature as fragile and have more precautionary views of technol- 
ogy. The hierarchists do not have strong attitudes regarding technology and 
are more likely to be ambivalent, but they do value authority, expertise, and 
the status quo (van Asselt & Rotmans 2002). Accordingly, hierarchists should 
be the most influenced by the risk statements made by authorities and experts. 
Lastly, the fatalists have little faith in the predictability of nature or humanity or 
in our ability to learn from past mistakes. Because fatalists doubt their capac- 
ity for control and self-determination, they often opt out of the policy-making 
process entirely. One implication of this classification scheme is that risk analysts
will tend to produce assessments that align with their own views and ignore the 
other cultural world views. 

Despite its theoretical elegance, cultural theory has had limited predictive 
success (Marris, Langford & O’Riordan 1998; Sjöberg 1998). For example, cultural
world view accounted for only three percent of the variance in surveys measuring 
the perception of risks and benefits of childhood vaccinations (Song 2014). 


Psychometric measures 


Explaining risk attitudes via psychological models, such as the risk-as-feeling
concept (Slovic et al. 2004), has been more successful than cultural theory but
still has limited explanatory power in empirical studies (Sjöberg 2000).
Psychometric studies have been useful for explaining why public risk perception
often deviates from calculated risks (e.g., Slovic 1987). They have also been used
to directly investigate attitudes about technology and have found that the
acceptability of a technology is explained not only by its risk but also by its
perceived usefulness and whether it could be substituted with something else
(Sjöberg 2002). Additionally, technologies that tamper with nature and may have
unknown effects are perceived to be riskiest.

However, even an expanded set of psychometric characteristics accounts for 
only a portion of the variance in technological risk attitudes, indicating the 
origin of these attitudes is still not well understood. For example, a survey of 
perceived risk from the chemical industry in South Korea found less than 10 
percent of the variance was explained by cultural theory or psychometric meas- 
ures, which was less than even basic demographic factors, such as education 
level or gender (Choi 2013). 


Other theories 


Other theories for why people embrace or reject technology have been pro- 
posed. For example, cultural cognition theory is a hybrid of cultural theory 
and psychometric models (Kahan et al. 2006, 2010, 2015). Cultural cognition 
recognizes a set of psychological mechanisms that allow individuals to pref- 
erentially select evidence that comports with the values of the cultural group 
with which they identify (Kahan, Jenkins-Smith & Braman 2011). In practical 
terms, this means evidence from experts who share the same values as their
audience is viewed as more credible.² Likewise, information is more likely to
be accepted if it can be presented in such a way as to be consistent with existing
cultural values (Kahan 2010). Most people will reject new information if it
cannot fit within their existing worldview or that of their social group.

This is consistent with psychologists’ wry description of humans as ‘cognitive
misers’ (Fiske & Taylor 1984). Most of us are not inclined to assess the veracity
of every claim we encounter. But we cannot blame it all on innate laziness. To
take the time to gather evidence and also acquire the expertise to evaluate the
evidence can only be done on occasion; no one can be an expert about
everything. Thus, most knowledge rests on a certain degree of faith in its source.
This is where heuristics are used. We tend to believe claims made by people
we personally trust or who have already been vetted by society: people with
prestigious academic or corporate affiliations; people who are famous or are
endorsed by famous people; or, barring any other cues, people who are
physically or emotionally appealing to us. This can cause trouble. For example, John
Carreyrou’s 2018 book, Bad Blood, recounts the hopeful rise and fraudulent fall
of the blood testing company Theranos. The story serves as a cautionary tale for
future technologists but also as a stark reminder of how the bandwagon effect
of our reputational system can catastrophically fail us when respected
individuals are deceived by commanding personalities.

² This point is particularly useful for science communication. Many scientists have
found more success through direct local engagement with the public than by
lecturing them via social media.

Other work in social psychology and sociology has also explored the inter- 
dependence of beliefs and how social groups can become intellectually closed 
off due to extreme skepticism of contradictory evidence provided by outsid- 
ers. However, the timing and interaction of influence between individuals and 
their social groups can be complex to model (Friedkin et al. 2016). Perhaps, as 
a result, cultural cognition theory has similar explanatory power to its parent 
theories. This hybridization may be the best of both worlds but still does not 
fully explain the source of risk attitudes. 

Another theory comes from philosophers of technology. Framed in terms 
of trust, technological optimism and skepticism can be viewed as trusting or 
distrusting the reliability of technology as well as trusting or distrusting our 
ability to use technology appropriately (Ellul 1964; Heidegger 1977; Kiran & 
Verbeek 2010). 

Sociologist Daniel Fox ascribed technological optimism to fatigue with the 
political process and considered it a misguided desire to resolve seemingly 
intractable social problems: ‘The rejection of politics among intellectuals often
takes the subtler form of what I call technocratic solutionism. Experts who 
practice solutionism insist that problems have technical solutions even if they 
are the result of conflicts about ideas, values and interests’ (Fox 1995: 2). Under- 
standing which problems require or are amenable to technological solutions is 
not always obvious. For example, conventional wisdom has usually attributed 
famines to food production failures—a technological problem. However, econ- 
omist Amartya Sen found that famines in the past century occurred during 
times of adequate food production; the real culprits were hoarding and high 
prices brought about by bad governance and corruption—a political problem 
(Sen 1981). 

Another idea not previously explored is the application of social psycholo- 
gist Jonathan Haidt’s theory of the ethical differences between liberals and con- 
servatives (Haidt 2012). He argues there are six basic themes in moral thinking: 
prevention of harm, fairness, liberty, loyalty, authority, and sanctity. Further- 
more, differences in political ideologies can be traced to which of these themes 
are emphasized. Specifically, American liberals judge the morality of actions 
primarily by two characteristics: does it cause harm and is it fair. Conserva- 
tives, on the other hand, appear more concerned with the additional criteria of 
loyalty, authority, and purity. For example, the purity concept is central to the 
‘wisdom of repugnance’ championed by bioethicist Leon Kass, who chaired the
President’s Council on Bioethics under President Bush (Kass 1997).

This difference in conceptions of morality could partially explain why
technological attitudes do not clearly align with political ideology. While both
groups would be concerned about potential harm, it would be a priority for
liberals. This might account for the liberal skepticism of technologies with
potential environmental impacts but optimism for social media technology that is
perceived to encourage equality. Likewise, conservatives might be more likely
to accept military technologies that reinforce authority but reject medical
technologies that offend their sense of sanctity.

This idea of the non-alignment of political ideology with technological
acceptance is also captured in sociologist Steve Fuller’s contention that the old
left-wing/right-wing political ideologies are being supplanted with
‘proactionary’ (Fuller & Lipinska 2014) and ‘precautionary’ attitudes about science
and technology that approximately correspond to technological optimism and
skepticism, respectively. Dr. Fuller revives terms coined decades ago by
futurist FM-2030 (born Fereidoun M. Esfandiary) to argue that the left-right
political distinction is becoming an up-down technological risk distinction:
‘up-wingers’ embrace technological change as a source of opportunity and
‘down-wingers’ are more concerned about the risks of technological hubris. More
importantly, the new ideological dichotomy is not merely a rotation of the old
groups but a reordering. Up-wingers would be expected to come from the
technocratic left and the libertarian right, while down-wingers would encompass
environmentalists from the left and religious conservatives from the right. In
terms of figureheads, think Elon Musk versus Pope Francis.

While these new labels provide some new insight into cultural trends, they
remain just as coarse as the old ones and are no better at describing the
complexity of individuals or the source of their attitudes. For example, it is easy
to imagine a person who is extremely supportive of medical research, cautiously
optimistic about geoengineering as a temporary solution to climate change,
but highly skeptical of pesticides, GMOs, and the value of social media.³ How
would we classify that person’s technological ideology?

³ Well, at least it is easy for me to imagine.

In the end, none of the individual theories presented here alone offers a
convincing explanation for the variation in attitudes about technological risk. The
most persuasive trend is that multidimensional measures of risk perception
tend to have more explanatory power than single factor explanations (Wilson,
Zwickle & Walpole 2019); that is, risk perception is complex. However, this
collection of theories does give us a general idea of the range of factors at play
in the formation of technological risk attitudes. Given our present inability to
explain their origins, it may be more helpful to shift our focus from why these
attitudes exist to the question of whether and how technological risk attitudes
change over time.


How Do Technological Risk Attitudes Change?


How static are the ideas of technological optimism and skepticism? This
question is important if we are to ascertain whether an event or additional
information, such as a research study or risk assessment, is likely to have a
substantial policy impact. Of course, influencing attitudes about technology is not
as easy as simply presenting new information. Even risk communication meant
to reduce concerns can have the unintended effect of raising fears about related
risks among skeptics (Nakayachi 2013).

First, it appears cultural attitudes regarding technology change over time. The 
general public of the early and mid-20th century saw a steady stream of techno- 
logical wonders: radio, air conditioning, plastics, automobiles, airplanes, anti- 
biotics, television, nuclear power, and so on. The technological optimism fueled 
by this parade of inventions perhaps culminated with a man walking on the 
moon. Subsequent generations have failed to witness such large-scale
technological spectacles, leading to a concern by some that society can no longer solve
big technical challenges (Pontin 2012). Nonetheless, steady medical advances 
and the internet age have reignited technological optimism in certain segments 
of society. The futurist Ray Kurzweil has predicted artificial intelligence will 
exceed any human intelligence by 2030. His predictions of other computing 
milestones were considered unrealistic in the past, but his track record of suc- 
cess has made him mainstream in Silicon Valley such that he was hired as a 
director of engineering at Google in 2012. 

Trends in technological optimism and skepticism can also be traced through 
science fiction literature. The romantic era of early science fiction, which encom- 
passed the second half of the 19th century, envisioned the utopian potential of 
technology. For example, Jules Vernes’ submarine Nautilus in 20,000 Leagues 
Under the Sea is a technological marvel used to explore and catalogue nature, 
aid the oppressed, and oppose militarism. Subsequently, the first few decades 
of the 20th century, dubbed science fiction’s ‘radium age’ (Glenn 2012), saw 
a trend toward technological skepticism. Aldous Huxley’s Brave New World 
(1932) is the epitome of the era. Like all literature, dystopian science fiction is 
a product of its time. Walter Miller’s A Canticle for Leibowitz (1959) and Kurt 
Vonnegut’s Cat’s Cradle (1963) were inspired by the threat of Cold War nuclear
annihilation. More recent trends have focused on biotechnology, such as Margaret
Atwood’s Oryx and Crake (2003).

While prevailing attitudes about technology have changed over time, they 
also vary geographically. One comparison of US and British attitudes regard- 
ing analog computing technology prior to World War II argues the US may 
have surpassed British efforts due to cultural differences regarding resistance 
to change and general enthusiasm for technological innovation (Bowles 1996). 
Likewise, emerging nations are considered more optimistic regarding the abil- 
ity of technology to solve problems than developed nations with strong envi- 
ronmental movements (Gruner 2008). 

The variability of technology attitudes within an individual may be just as 
complex as temporal and geographical trends in society. Unlike the cultural 
theory of risk, the risk-as-feeling psychometric framework allows for vari- 
able attitudes about technology within a person. This agrees with our personal 
experiences where we may encounter individuals who are technological optimists
in some fields of science, such as the potential for medical technology
to improve people's lives, while being technological skeptics in others, such as 
blaming modern communications technology for loss of privacy. 

It also seems reasonable that dramatic personal experience could substantially
change an individual's technological risk attitude. The use of pharmaceuticals
and medical devices, such as artificial joints, has become commonplace
and can substantially improve quality of life. This not only makes technology 
more familiar, it can also greatly increase the technological enthusiasm of the 
person dependent on the technology. This appears to be the case for the tech- 
nology theorist and transhumanism advocate Michael Chorost, whose hearing 
was restored by a cochlear implant (Chorost 2005, 2011), or Neil Harbisson,
who is totally color-blind and whose perception has been augmented with a sensor
that converts not only color but also infrared and ultraviolet wavelengths to
sound, allowing him to detect things invisible to the human eye.

Conversely, technological optimism may evolve into skepticism when a job is 
made obsolete through technology. Travel agents, typists, toll booth attendants, 
telephone operators, video store clerks, and photo lab employees are just a few 
of the many careers that quickly appeared, seemed as though they would always 
exist, and then just as suddenly disappeared due to disruptive technological 
change. Technology-induced obsolescence of this nature is now even affecting 
many of the skilled professions (Rifkin 1995; Rotman 2013). 


Principle of Dangerous Science 


So far we have discussed the origins and flexibility of technological risk atti- 
tudes. Despite the lack of a comprehensive theory, it appears these attitudes are 
influenced by a variety of factors that include culture, feelings, and personal 
circumstances. They also appear to be malleable over time at both the indi- 
vidual and societal level. Do these observations have any implications for how 
science and technology policy decisions are made? Based on the complexity of 
technological risk attitudes, can we make any general statements? 

First, let us start with the additional observation that few lines of research 
or emerging technologies have been banned or abandoned in the past for rea- 
sons unrelated to science or practicality. Many controversial lines of research 
and technologies, such as the practice of lobotomies, have been abandoned 
for lack of scientific value and the arrival of better alternatives.⁴ However, the list of scientifically
valid research that was stopped for moral reasons or social concerns is
relatively short and consists primarily of weapons technology; internationally 


4 The procedure’s inventor, António Egas Moniz, shared the 1949 Nobel Prize in Physiology or
Medicine for it, which reflects rather poorly on our ability to predict the long-term value
of research.
banned research includes biological and chemical weapons research as well as
weather modification for military purposes (the 1978 Environmental Modification
Convention). Likewise, the list of highly restricted research—for example,
embryonic stem cell research in the US—is also relatively small. Despite
substantial ethical concerns or public opposition, a wide range of controversial 
and non-essential research activities are currently legal in most places. 

In general, the number of technologies and lines of research banned for ethi- 
cal or risk perception reasons is small enough to suggest a general principle of 
dangerous science. The principle is simple: research is not banned solely for 
moral reasons. No line of research or technology will be forbidden until it has 
been scientifically discredited or deemed impractical or a better alternative has 
been accepted in its place. The implications of such a principle are substantial if, 
as sociologist John Evans (2018) argues, the vast majority of conflicts between
religion and science are moral debates, not disagreements over knowledge. 

A precursor to this principle is the premise that if something can be done that
appears to have some intellectual or economic value, someone will view it as a
valid idea and proceed before anyone else beats them to it. This leads to
the principle itself: we generally do not stop research just because it may
be unwise, potentially harmful, or otherwise ethically dangerous. The result
is that dangerous science generally moves forward until something eliminates the
controversy. The controversy can be eliminated in one of several ways: new 
information or extensive experience reduces the perceived danger, cultural 
acceptance decreases opposition, or an alternative technology eliminates the 
need for the research. 

Underlying this principle is a pervasive attitude of inevitability surrounding 
technology in modern society. The sense that technology controls us as much as
we control technology is described by social scientists as ‘technological determinism’
(Bimber 1994). Popular books on the philosophy of technology (e.g.,
Arthur 2009; Kelly 2010) adopt this mindset when they describe the evolution 
of technology and occasionally use the term in the biological sense. By using 
multiple meanings of evolution, these technologists reveal their mental image 
of technology as independent and alive and perhaps uncontrollable. Scholars 
of the field of science and technology studies have long argued technology, by 
its very anthropogenic nature, is embedded with human values (e.g., Jasanoff 
2016). Technology requires intentional and unintentional value-laden deci- 
sions in its creation, and the more complex the technology, the longer the string 
of decisions. The result is that technology is quite controllable if we make thoughtful
and compelling choices to do so. Thus, technological inevitability may really 
be a method for remaining uncritical about the results of human endeavors. 
In a 1954 hearing, J. Robert Oppenheimer, the lead physicist of the Manhattan 
Project, admitted an Achilles heel of scientists left to their own devices: ‘When
you see something that is technically sweet, you go ahead and do it and you
argue about what to do about it only after you have had your technical success.
That is the way it was with the atomic bomb.’ Perhaps the primary force behind
technological inevitability is merely the pull of an intellectual challenge heedless
of the consequences. Compounding the problem is the centrality of technology
to our lives, which exerts a powerful force on what activities we believe
are possible, our behaviors, and our expectations. This can support the illusion
that the present path of innovation is the only option.

Continuing with this biological metaphor, technological optimists think of 
technology as benignly natural and inevitable, while technological skeptics see 
it as aggressively virus-like and relentless. If the history of technological pro- 
gress gives the appearance of supporting the concept of inevitability, techno- 
logical skeptics see this inexorable pull as more ominous: ‘we are free at the first 
step but slaves at the second and all further ones’ (Jonas 1984: 32). 

The emphasis on describing the trajectory of technological progress also 
has implications for policy making. With an assumption of inevitability, 
technological discussions tend toward predicting what will come next rather 
than discussing what should come next. While there are academic communities 
working in specific subfields of technology ethics, such as bioethics, the broader 
technology ethics community is surprisingly small considering the criticality 
of technology in modern society.⁵ Likewise, innovation literature is primarily
focused on how to encourage innovation rather than normative discussions 
of where innovation should lead. The idea of ‘responsible innovation’ has only
started to gain traction in this century (Guston et al. 2014). If we fail to actively 
direct science research and technological innovation, what at first glance feels 
like technological determinism is actually ‘technological somnambulism’
(Winner 1986). 

The principle of dangerous science appears to be particularly true for emerg- 
ing technologies, which are noted for rapid change. When the time between 
basic research and usable technology is very short, the process of societal reac- 
tion, evaluation, and public policy formulation often lags the rate of technolog- 
ical progress. Likewise, because the research and its technological applications 
can be nearly concurrent, available empirical evidence may be of limited value 
in estimating the future trajectory of the technology (Chameau, Ballhaus & Lin 
2014). In the absence of convincing data or tractable theoretical frameworks, 
policymakers tend to favor cautious permissiveness until more compelling evi- 
dence is available. There is a pervasive fear of limiting technology development, 
especially in the face of marketplace competition, international military rivalry, 
industrial anti-regulatory lobbyists, and science proponents raising alarms of 
innovation deficits. No one wants to be seen as hampering scientific progress 
because of speculation or slippery slope arguments (even if they are not unwar- 
ranted concerns). Permissiveness, at least in the US, is often a defensible default 


5 There isn’t even agreement on whether there should be more. While some are
alarmed by the paucity of bioethicists working in the field of dual-use biotechnology 
(Rappert & Selgelid 2013), others argue the professionalization of bioethics removes 
important discussions from public debate (Cameron & Caplan 2009). 
position because regulatory agencies are frequently unprepared and lack jurisdiction
until public sentiment pushes policymakers to react.

Here, we finally come back to the idea of competing technological risk atti- 
tudes that exacerbate technological inevitability in policymaking. Technologi- 
cal optimism and skepticism function as opposing ideologies, which tend to 
limit the use of appeals to common values as compelling reasons for technology 
regulation. This creates a society where there is plenty of ethical assessment 
of science and technology going on—institutional review boards, national and 
international science ethics committees, governmental regulatory agencies, et 
cetera—yet the collective oversight tends to narrowly define its mission toward
preventing only the most egregious ethical lapses. The result is a sympathetic 
ethical review process that gets caught by surprise when occasional research 
causes public alarm and the obligatory post hoc regulatory scramble. 

The following four examples further explore the principle of dangerous sci- 
ence within the context of contemporary emerging technologies. The first three 
serve as examples of the principle in action, while the fourth is a potential 
counterexample. 


The Principle in Action 
Womb transplants 


While organ donation is a well-established lifesaving procedure, non-essential 
organ transplants are more controversial. In 2014, the transplantation of a 
uterus from a post-menopausal woman to a woman of child-bearing age 
without a functional uterus resulted in the birth of a premature, but other- 
wise healthy, child (Brännström et al. 2014). Several bioethicists noted the 
procedure was not medically necessary and was rather dangerous. It required 
transplanting not only the uterus but also much of the surrounding uterine 
vasculature during a lengthy surgery. The recipient then had to live with the 
risks associated with immunosuppressant treatments and the potential of 
transplant rejection (Farrell & Falcone 2015). Furthermore, uterine transplants 
are intended to be temporary, and a second surgery is required to remove the 
donated uterus so the patient can discontinue immunosuppressant therapy 
after successful childbirth. While the surgeons involved appeared to be deeply 
concerned for the safety and mental well-being of their patients, who were 
desperate to experience pregnancy, there was also the unsavory appearance of 
a race for personal scientific glory to be the first to successfully perform the 
procedure (Orange 2014). 

Fundamentally, the question of this research comes down to whether one 
believes it is acceptable to perform risky non-essential surgery for personal, 
cultural, or religious reasons. While medically unnecessary procedures are
performed on a regular basis, the level of risk to the patient in this case is 
substantial. Furthermore, from the perspective of the child, transplant pregnancies
have much higher health risks. Bioethicist Julian Savulescu has argued
for a principle of ‘procreative beneficence’ that requires parents to maximize 
the health of their future children. From this perspective, having a child via 
uterine transplant should be avoided because it is a higher risk to the child’s 
health compared to alternatives (Daar & Klipstein 2016). Nonetheless, propo- 
nents argue the work passes the test of clinical equipoise—the potential benefits 
exceed the risk of unintended harm to the mother, child, and donor (Testa, 
Koon & Johannesson 2017). 

The principle of dangerous science suggests this line of research will continue,
especially now that it has proven successful. The first success outside of Sweden
occurred in Texas in November 2017, which further encourages medical
professionals elsewhere. However, given the considerable expense, the high 
failure rate, and the availability of alternatives, such as surrogacy or adoption, it 
is likely the procedure will remain relatively uncommon. It is also interesting to 
note Swedish surgeons are world leaders in uterine transplants—eight success- 
ful childbirths had occurred by the end of 2017. Their success is partially due to 
a lack of alternatives because Sweden discourages the use of surrogate mothers, 
which is, ironically, deemed unethical.⁶

While the procedure is unlikely to be banned for ethical concerns, womb 
transplants may be eventually abandoned due to impracticality. The true end of 
the procedure will likely come when scientists are able to grow a new functional 
uterus from a woman’s own stem cells—a technically (and ethically) superior 
alternative. 


Human genome editing 


While gene manipulation using recombinant DNA techniques is decades-old
technology, new techniques and recent research have led to proposals of germ- 
line editing (making genetic changes in embryos that are inheritable). This 
has raised concerns over potential multi-generational harm as well as a public 
backlash against meddling with nature. In early 2015, the British parliament 
approved three-person in vitro fertilization (IVF), where a third person pro- 
vides mitochondrial DNA to correct mitochondrial diseases. Some see this as 
a reasonable first step, but larger-scale genome editing must await confirmation


6 A growing number of governments agree that surrogacy is exploitation—particularly
when poor foreign women are paid to rent their wombs. Surrogacy has been banned 
in some jurisdictions and restricted to altruistic or family surrogacy in others. At 
first glance, this appears to be a potential exception to the principle of dangerous
science. However, the problem is not with the science itself, which is commonly
used in fertility clinics, but with the legal contract of surrogacy. Or maybe this is 
splitting hairs. 
that it is safe (Baltimore et al. 2015). Although a 2016 US budget bill banned FDA
funding for human embryo heritable genetic modification, the first successful
mitochondrial replacement therapy by three-person IVF occurred in New
York in 2016, resulting in the birth of a baby boy (Zhang et al. 2017). Because 
mitochondria are inherited from mothers, the technique does not result in 
heritable genetic modification in male children, although there appear to be 
rare cases of mitochondrial DNA inheritance from fathers (McWilliams & 
Suomalainen 2019). However, when the New York case was published in an 
academic journal, there were concerns regarding questionable institutional 
review board approval, the level of informed consent, and the potential health 
effects of small amounts of diseased mitochondria transferring to the child 
(Alikani et al. 2017). Even scientists with substantial financial stakes in the new 
gene editing technology have argued parents should use genetic testing and 
screening options instead, reserving the new technology for the treatment of 
disease in somatic (non-germline) cells (Lundberg & Novak 2015). 

The concerns over germline editing have been compared to those first raised by IVF,
which was initially controversial but rapidly became a widely used medical
technology. This may be an appropriate comparison. In 2018, a clinic in Kiev,
Ukraine, acknowledged publicly it was offering three-parent IVF as a treatment 
for infertility. This new application of mitochondrial replacement had already 
resulted in the birth of four children, one of whom was female and may pre- 
sumably pass on her genetic modification to her children. 

Chinese scientists were the first to attempt germline editing with non-viable 
embryos obtained from fertility clinics to test the potential for curing a fatal 
genetic blood disorder (Liang et al. 2015). While the initial study had a low 
success rate, the technology rapidly improved (Liang et al. 2017). Given the 
vast number of genetic diseases that appear amenable to treatment by genome
editing (Ma et al. 2017), cautious support for this research appears to be growing.
A 2017 report from the Committee on Human Gene Editing of the National 
Academies of Science and Medicine shifted the general expectation from total 
prohibition to limited approval (National Academies 2017b). Likewise, surveys 
seem to indicate, within a few short years, the majority of the public has already 
accepted the idea of inheritable genome edits to prevent disease (Scheufele 
et al. 2017). 

Some critics have called for a complete ban, claiming germline editing con- 
stitutes ‘a return to the agenda of eugenics: the positive selection of “good” ver- 
sions of the human genome and the weeding out of “bad” versions, not just 
for the health of an individual, but for the future of the species’ (Pollack 2015). 
Others are concerned germline editing will cause the public to restrict all gene 
editing activities. However, history and the principle of dangerous science suggest
otherwise.

There have been few calls in the US for moratoria in the biological sciences: 
recombinant DNA research in 1974, human reproductive cloning in 1997, 
and influenza gain-of-function research in 2012. Of these three moratoria 
over the past 40 years, the recombinant DNA moratorium was lifted within
a year (Berg & Singer 1995). The third moratorium on gain-of-function 
research was lifted in 2017 as detailed in the first chapter. The human cloning
moratorium is still unofficially in effect for federally funded research;
however, human cloning was never actually banned in the US despite legisla- 
tive attempts in 1997, 2001, 2003, and 2005. The moratorium was originally 
inspired by a spate of large animal cloning successes (for example, the birth of 
Dolly the sheep in 1996). However, once it was found defect rates were high 
in the cloned animals, commercial interest in cloning large animals dimin- 
ished. Nevertheless, the research continued, and within 20 years a Korean 
laboratory began marketing pet dog cloning services. Even with a low success 
rate that requires many donor and surrogate dogs, as well as a $100,000 price 
tag, the ethically questionable but lucrative niche market supports the principle
of dangerous science. 

In early 2018, the first successful cloning of primates was announced in 
China. The intent of cloning macaques was to improve medical testing through 
the use of genetically identical animals. Successful cloning of a primate suggests 
the hurdles to human cloning are negligible. In the end, the US moratorium on human
cloning did not stop research from walking right up to the forbidden line. If a 
new biotechnology has a use with tempting benefits, moral concerns provide 
a weak barrier. 

In late 2018, Chinese researcher He Jiankui announced he had edited the 
genome of recently born twin girls to prevent expression of the white blood 
cell receptor protein CCR5 with the intent of preventing HIV infection. The 
edit emulates a natural variant, CCR5-Δ32, which provides HIV-1 resistance
to some European populations. It is posited this variant confers resistance to 
some types of viruses but may also increase susceptibility to other viruses, such 
as West Nile and influenza. Given the lack of understanding of the implications 
of this particular edit, and CRISPR⁷-mediated germline editing in general, the
work was considered extremely premature (Ledford 2019). Along with wide- 
spread condemnation by the international scientific community, an interna- 
tional germline editing moratorium was proposed (Lander et al. 2019). Dr. He 
was terminated from his university position, a government investigation was 
launched against him, and the possibility of criminal charges was discussed.
The work even led to an investigation of a Stanford University post-doc advisor 
to Dr. He who, despite being cleared of any wrongdoing, became the subject 
of an unflattering New York Times article. However, even after the intensity 
and breadth of all this criticism, a Russian scientist subsequently announced 
interest in performing a similar germline editing procedure (Cyranoski 2019). 


7 Since first proposed in 2012, precision gene-editing tools based on CRISPR (clus- 
tered regularly interspaced short palindromic repeats) DNA found in prokaryotes 
have greatly advanced the genetic engineering abilities of scientists. 
Synthetic biology


Regardless of whether one considers it a new field or an old one with a new 
name, synthetic biology has experienced a similar level of promise and con- 
troversy. Although rather amorphous and evolving, the field entails purposeful 
genetic manipulation coupled with the engineering philosophy that creation 
is more instructive than observation (Roosth 2017). The field has two comple- 
mentary approaches: editing existing genomes with tools such as CRISPR-Cas9 
or the more tedious process of experimentally building ‘minimal genomes’ 
(Hutchison et al. 2016) from scratch that contain only essential and functional 
genetic material. The engineering mindset of synthetic biology also emphasizes 
problem solving. The techniques have already been applied to the production 
of biofuels, pharmaceuticals, and food additives. 

Applications started with viruses and bacteria before moving on to simple 
eukaryotes. For example, the Synthetic Yeast Genome Project (Sc2.0) has a goal 
to re-engineer and simplify the 12 million base pairs of DNA in the 16 chro- 
mosomes of baker’s yeast, Saccharomyces cerevisiae (Richardson et al. 2017) so 
it may be used as a miniature factory on which to base a myriad of other func- 
tions. The expectation is that plants and animals are the next step. 

The field brings to reality some seemingly fantastical possibilities, including 
the creation of organisms resistant to all viral infections (Lajoie et al. 2013)—a 
useful trait in bioreactors but also a major competitive advantage in nature—or 
even the creation of new forms of life (Malyshev et al. 2014). With it comes 
immense commercial potential (Gronvall 2015) as well as a host of new social 
and ethical considerations (Cho et al. 1999). 

Expert opinions on synthetic biology vary widely. Technological optimists 
claim detailed modification of genetic sequences will result in fewer
unintended consequences. Likewise, synthetic species that cannot mate with 
wild species should, in theory, greatly reduce the likelihood of unintended 
gene transfers or uncontrolled releases. Meanwhile, technological skeptics fear 
unpredictable emergent properties, scientific hubris, and expanding biosecu- 
rity threats. With the potential for substantial benefits and harm, questions 
have been raised regarding whether the current regulatory systems and avenues 
of risk assessment are sufficient (Lowrie & Tait 2010; Kelle 2013; Drinkwater 
et al. 2014; National Academies 2018a). However, despite the recurring
fears of scientists playing God or references to Frankenstein, it seems, much
as with the human cloning debates, the primary concern about bioengineering
is maintaining the specialness of human life (van den Belt 2009). As long as 
that line is not crossed, the synthetic biologists will probably be allowed to 
tinker on. 

The many uses already imagined for synthetic biology in the fields
of medicine and manufacturing alone suggest a future that could be much
healthier, cleaner, and more energy-efficient. Of course, as with all powerful science,
one person’s commercial imperative is another person’s irresponsible
pursuit. One emblematic example is the successful bioengineering of yeast
to convert sugar into morphine (DeLoache et al. 2015; Fossati et al. 2015; 
Galanie et al. 2015). The original work could not generate commercially 
viable production levels, but to critics the research suggested home-brewed 
heroin and all its associated social problems were inevitable (Oye, Lawson & 
Bubela 2015). Opioid bioengineering would appear to be a classic example 
of the principle of dangerous science in action. The work proceeded in the 
midst of a raging opioid addiction epidemic in the US, and the general timing 
of announcements from competing research teams suggested the artificial 
urgency of a race for personal glory. Given the plentiful supply of raw 
materials and ease of manufacturing, there was no public health call for more 
accessible opioids (McNeil Jr 2015). Interviews with the researchers suggested 
they believed their work would help address existing pain-management crises 
in less industrialized countries. However, any pain medication shortages in 
those countries have been primarily caused by policy decisions rather than 
unaffordable medication. 

Despite the obvious potential for misuse, the research was funded because 
opiates derived from yeast rather than opium poppies could potentially reduce 
opioid production costs. Benefits to areas, such as Afghanistan, where opium 
poppy production provides funding for military conflicts also served as fur- 
ther theoretical justification. On top of this, the researchers created a biotech 
start-up, Antheia, to ramp up production efficiency and, perhaps to mollify 
their critics, suggested the potential for creating less addictive opioids in the 
future. Besides private investors, Antheia’s work was funded by the NSF, NIH, 
and the National Institute on Drug Abuse. Economic benefits aside, questions 
arise regarding the likelihood an improved process can be kept out of the illegal
opioid market. Likewise, replacing poppy-derived opioids would affect only
legal farming in India, Turkey, and Australia. The illegal opioid market fueled
by Afghanistan would be unaffected unless the illegal market also converted 
to the new process. The end result is a formidable regulatory and public health 
challenge if the technology succeeds. 


Autonomous weapons 


Technologies developed specifically for military purposes, such as bioweapons 
research, appear to serve as the counterexample to the principle of dangerous 
science. Research on specific classes of weapons is one of the few types of
research banned in the past.⁸ However, it could be argued weapons research


8 Efforts can be legislative or legal actions. For example, efforts in the US to sell plans
for 3D-printing untraceable plastic guns have been thwarted by multiple court injunctions.
However, the primary obstacle remains the technical skill and expensive
equipment still required to create a functional weapon. 
bans are merely a special case in the sense that the beneficial uses of weapons (e.g.,
deterrence) are limited and can often be accomplished with less controversial 
alternative technologies. It should also be noted some weapons research has 
continued in secret even after the research was banned, such as the bioweapons 
program in the former Soviet Union (Leitenberg, Zilinskas & Kuhn 2012; Sahl 
et al. 2016), thus conforming to the principle of dangerous science.⁹

One contemporary example is the field of autonomous weapons. This cen- 
tury has seen an explosion in the number of drones equipped with lethal 
weapons. Robotic weapons in various stages of development now include the 
option of fully autonomous operation (Chameau, Ballhaus & Lin 2014). Pro- 
ponents argue autonomous weapons can minimize civilian casualties, while 
critics are concerned these systems will lower the social cost of going to war 
and will eventually be used universally by both police forces and terrorists 
(Russell 2015). There have been efforts to ban autonomous weapons through 
the Convention on Certain Conventional Weapons led by computer scien- 
tist Noel Sharkey, who chairs the nongovernmental International Commit- 
tee for Robot Arms Control. Its mission statement includes the reasonable 
premise ‘machines should not be allowed to make the decision to kill people.’
Another effort, the evocatively named Campaign to Stop Killer Robots,
was launched in 2013 to raise public awareness about lethal autonomous 
weapons and to persuade prominent scientists and technologists to publicly 
denounce such systems. 

Autonomous weapons systems may have the best chance of being banned 
solely on moral grounds for two reasons. First, a history of international weapons
treaties—the 1968 Non-Proliferation Treaty, the 1972 Biological Weapons
Convention, and the 1993 Chemical Weapons Convention—suggests weapons
technologies create a unique broad consensus regarding their risk-benefit
assessment. Second, the 1997 Ottawa Treaty, which bans landmines, creates a 
specific precedent for banning (albeit simple) autonomous weapons. 

Whether autonomous weapons are actually banned remains to be seen. 
Recent books on the subject tend toward pessimistic inevitability while still 
offering warnings and hope that humans will decide otherwise (Singer 2009; 
Scharre 2018). However, prospects are dimming as technology advances and 
the extraordinary becomes familiar. Countries leading the rapid development 
of this technology—including the US, the United Kingdom, and Israel—have 
argued against any new restrictions. In 2018, a research center for the
Convergence of National Defense and Artificial Intelligence was opened in South
Korea despite protests from computer scientists in other countries suggesting
that work by a respected research university was incrementally moving
autonomous weapons into the realm of normality. In a 2016 Agence France-Presse
interview, computer scientist Stuart Russell stated, ‘I am against robots for
ethical reasons but I do not believe ethical arguments will win the day. I believe
strategic arguments will win the day.’

* Likewise, the US Defense Advanced Research Projects Agency has funded research
to create food crop horizontal gene transfer systems (as opposed to vertical
transfer through inheritance) that use gene-editing viruses delivered by
insects. The purported intent is to protect crops within a growing season
rather than waiting to protect subsequent crops. However, the simplest use of
this problematic technology is as a bioweapon (Reeves et al. 2018).


Implications 


So how might the principle of dangerous science guide debates over emerging 
technologies? One observation is that technological optimists have history on 
their side. They need not overhype new technology because it is unlikely to be 
banned or heavily regulated until there is overwhelming evidence of harm. On 
the contrary, more voice could be given to the technological skeptics without 
fear of stifling innovation. 

For example, one particularly energetic defense of new gene editing tech- 
niques, such as CRISPR-Cas9, called for bioethicists to ‘Get out of the way’ 
(Pinker 2015). However, such advice is misplaced. Institutional review boards 
are not the main culprit delaying life-saving technology. Increasing research 
funding would improve far more lives than streamlining bioethics reviews. 
Of course, criticizing the obscure work of review boards is easier than simply 
asking taxpayers for more money. History suggests gene editing techniques 
are far too useful to be limited by more than cursory regulation. Once these 
powerful techniques become ubiquitous, a more realistic concern is they will
be difficult to control and society will have to depend upon the moral wisdom 
of bioengineers. 

Coming back to the avian influenza research discussed in the first chapter, are 
there some further insights available from the discussion of technological risk 
attitudes? Discrepancies in biotechnology risk estimates have been attributed to 
two competing narratives on technology development: biotech revolution and 
biotech evolution (Vogel 2013). The revolution narrative is a dystopian form of 
technological determinism that assumes biotechnology development
experiences rapid inevitable progress (Smith & Marx 1994). This view predominates
in the biosecurity community and has heavily influenced US biosecurity policy
(Wright 2006). The evolution narrative is based on a sociotechnical model of 
technology development and takes the view that biotechnology is built on slow 
and incremental innovation (Jasanoff 2004; Nightingale & Martin 2004). The 
revolution narrative roughly equates to the skeptical technological risk attitude, 
while the evolution narrative is a more benign and optimistic technological risk 
attitude. The biotech evolution narrative also forms the basis for incremental- 
ism, an argument frequently used by proponents of research freedom. Each 
published paper is usually a small addition to the corpus of scientific
knowledge. Thus, if previous papers were not restricted, then why limit the
next one? This is why regulating entire lines of research rather than
individual projects
can be a more effective approach (Rappert 2014). 

At the December 2014 NRC symposium on H5N1 research, biodefense 
policy expert Gregory Koblentz discussed the propensity of risk attitudes to
dominate risk assessments where there are sparse data. In the context of bios- 
ecurity risks, the optimists believe bioterrorism risk is exaggerated because few 
past terrorist attacks used bioweapons, terrorists tend to use more readily avail- 
able weapons, and the technical obstacles are significant. Conversely, pessi- 
mists believe bioterrorism risk is understated because past uses of bioweapons 
show terrorists are innovative, terrorist acts have become increasingly lethal 
over time, terrorist ideologies embrace mass casualties, and technical obstacles 
are decreasing with time. The parallels of these opposing views to the broader 
categories of technological optimism and skepticism are clear. The dichotomy 
of views is partially due to the limited historical record of biological weapons, 
which leaves a great deal of room for interpretation (Boddie et al. 2015; Carus 
2015, 2017). More importantly, these opposing views result in very different 
and controversial risk management strategies. 

To summarize, in the absence of convincing evidence, technological risk
attitudes often guide decision making. For the purposes of policy formulation,
we can roughly separate these attitudes into technological optimism and
skepticism, a division that acknowledges reasonable people can reach opposing
conclusions based on the same available data. These labels are pragmatic
descriptions in the absence of a theory that satisfactorily explains the
origins of technological risk attitudes. One observation from a review of
these attitudes is that it is unclear how to
change them. They are influenced by many factors, and any single assessment 
is unlikely to change risk attitudes and thereby resolve a controversial policy 
debate. History also suggests research is rarely abandoned for moral reasons. If 
technological skeptics want to be more effective in making their case for cau- 
tion, they need to form pragmatic arguments for their position and propose 
alternative solutions that address existing technological needs. 


Managing Dangerous Science 


So how might knowledge of the inherent subjectivities of risk-benefit assessment 
and the principle of dangerous science guide the management of science and 
technology? Before recommending improvements to the assessment and 
management of dangerous science, it is helpful to first review and critique some 
ways research has been managed in the past. One reason new approaches are 
needed is the changing nature of modern science. 


The Changing Face of Technological Risk 


Much of the life science research discussed in this book represents a relatively 
new category of research—technologies that are both accessible and powerful. 
Whereas chemical technology was the focus of dual-use concerns in the first 
half of the 20th century and nuclear technology in the last half, biotechnol- 
ogy is the most significant challenge to science policy in the 21st century. The 
primary reason is many hurdles in the biosciences are now more theoretical 
than practical—material resources are rarely the limiting factor. Laboratories 
around the world, including an increasing number of small unaffiliated labs, 
can now create or order with relative ease what was only possible at a handful 
of state-of-the-art facilities a few years before (Suk et al. 2011; Adleman 2014). 

A personal anecdote demonstrates my point. In 2018, I attended a local 
elementary school science fair in a New York City suburb. Among the typical 
experiments displaying molding breads and growing plants, there sat a
fifth-grade experiment titled ‘Biohack.’ With the assistance of a parent (a
trained biologist), the student had used a mail-order CRISPR-Cas9 kit (Sneed
2017) to alter a non-pathogenic E. coli bacterium to become resistant to the
antibiotic streptomycin. While this relatively harmless gain-of-function
experiment was essentially a recipe for amateurs to follow, the fact that a
talented fifth-grader was able to successfully perform detailed genetic
engineering at home suggests
the potential pervasive power of this technology. Unfortunately, in 2017, 
authorities in Germany—where biological research is more strictly regulated— 
discovered a mail-order CRISPR kit from the US contained multidrug-resistant 
pathogenic E. coli. The risk to healthy users was deemed to be low, but the 
product was banned in Germany. Any regulators with qualms about the DIY- 
bio movement now had evidence to support their misgivings. 

The critical component in many of today’s emerging technologies is knowl- 
edge. Once the information is publicly available, the ability to create both 
immensely beneficial and harmful biotechnology becomes almost ubiquitous. 
The biggest remaining barrier is bridging the gap in tacit knowledge—the 
assumed professional knowledge, ignored or hidden details, and essential labo- 
ratory skills that are not recorded in the academic literature. However, even 
these hurdles are decreasing as modern communication lowers the cost of 
detailed knowledge transfer and as increasing numbers of experienced biotech- 
nologists migrate from lab to lab (Engel-Glatter 2013; Revill & Jefferson 2013). 
Furthermore, many of the highly skilled and technical steps are being removed 
through automation. Together, these factors point toward a field that is rapidly 
deskilling (Roosth 2017), with all its associated implications for the careers of 
biologists and public biosafety. Managing potentially harmful biotechnology 
requires a fundamental shift in thinking from nuclear weapons nonprolifer- 
ation policy, which historically focused on controlling materials as much as 
knowledge (Moodie 2012). 

Furthermore, managing biotechnology materials is far more difficult than
managing nuclear materials because biological materials reproduce and are not
easily detected. For
example, a transgenic orange petunia developed in Germany in 1987, but 
never sold to the public, was found in 2015 to be commercially available unbe- 
knownst to regulators and breeders. The result was a 2017 request to destroy 
an untold number of petunia varieties in the US because they contained DNA 
from cauliflower mosaic virus, which is listed as a plant pest. While the physical 
danger posed by these flowers is minimal, the regulatory fallibility they represent 
is not. 

One outcome of the new power of biotechnology is the biological science 
community is now finding it increasingly difficult to self-regulate. Instead of 
just worrying about harming patients, institutional review boards are now 
confronted with the reality of ‘bystander risk’ (Shah et al. 2018) and unknown
implications for society. The H5N1 research example hints at the difficulty of
reactively managing dangerous science. After research has been conducted 
and technologies developed, responses tend to be ad hoc and difficult to imple- 
ment. This raises important questions of how to proactively manage technolo- 
gies that have a high potential for negative consequences but are difficult to 
govern because the technologies are already widespread and knowledge, rather 
than materials, is the limiting factor (Miller et al. 2009). Suggested approaches 
for these cases focus on informal governance measures that strengthen the 
international culture of responsible research (National Academies 2018b)
and rely on moral persuasion methods, such as codes of conduct, educational 
outreach, transparency, and whistle-blowing support (Tucker 2012). One 
clever suggestion has been to use the patent system as a tool for regulating 
gene editing while in its early stages (Parthasarathy 2018), but this works best 
for large organizations; the DIY culture is less amenable to formal methods of 
oversight. Harnessing the power of social norms may be one of the few means 
of guiding the DIY-bio community (Nyborg et al. 2016). However, this requires 
a community where nearly everyone is doing safe experiments, and they 
consistently ostracize those who are perceived to be reckless. It is also helpful if 
trained scientists are actively engaged with citizen scientists to ensure they are 
using the safest methods available. 

Community self-monitoring through actions, such as anonymous reporting 
systems, can substantially reduce obvious avoidable accidents. However, such
measures appear to be an inadequate response to threats of intentional misuse because
bad actors often attempt to hide their work. Another avenue being pursued is 
the use of AI automated screening tools for ordered DNA sequences. Again, 
this assumes bioterrorists, rather than working in the shadows, will be using 
commercial DNA synthesis to speed up their misdeeds. 

While there is little consensus on how likely the actual bioterrorism threat 
may be, it is difficult to monitor and control bioterrorists within a largely self- 
regulated community. One cannot expect virtuous behavior from scientists and 
engineers who, by definition, have less than virtuous intentions. One example 
is the deadly 2001 anthrax letters that appeared to have originated from an 
anthrax biodefense research scientist who worked at a highly monitored US 
Army research facility on a military base (DOJ 2010). The lack of direct evi- 
dence, even after the prime suspect’s suicide, suggests our ability to counter, or 
even detect, the rare ‘mad scientist’ is insufficient. Furthermore, although the
emphasis in the H5N1 debate eventually shifted to accidental release, there was
no reason to dismiss the real threat of H5N1 bioterrorism (Inglesby & Relman 
2016). Given the few viable options for managing low-governability technolo- 
gies, involuntary monitoring of research activities by government intelligence 
agencies is a more likely future scenario. 


Common risk management strategies 


There are a few general criteria traditionally used to manage risk. First, one can 
attempt to maximize utility—seek maximum benefits for the minimum risk— 
which naturally leads to assessment by risk-benefit analysis. Second, one can 
attempt to solely minimize risk (zero risk being ideal), which leads to adopting
a precautionary approach to technology management. Precautionary technol- 
ogy bans are a more common choice for extreme risks with deep incertitude. 
However, precaution does not require complete bans; it can also take the form 
of cautiously deliberative support (Kaebnick et al. 2016). Lastly, one can take
a pragmatic technology-based risk minimization approach. This is often used 
in pollution control, where regulations call for ‘as low as reasonably achievable’ 
(ALARA) or ‘best available control technology’ (BACT). The first and third 
approaches are science-based strategies most useful when data are plentiful. 
However, these approaches are of less value for emerging dangerous science. 
Precautionary management is more easily applied to these situations, but the 
fear is it stifles innovation and progress. 


Critique of traditional risk management 


Predicting the future is a popular pastime with generally limited success. Unan- 
ticipated events with no historical precedent occur with surprising frequency 
(Taleb, Goldstein & Spitznagel 2009). Yet we tend to be overconfident in our 
assessment of risk and good at rationalizing inadequate risk management after 
the fact (Paté-Cornell & Cox 2014). Some of the excuses for poor risk man- 
agement include claiming uncontrollable forces (acts of God), unimaginable 
events (black swans), rare confluences of events (perfect storms), lack of prec- 
edent, excusable ignorance, conformance to standard practice, someone else’s 
responsibility, lack of legal liability, or even operator error. Post hoc risk
reduction is so riddled with biases that realistic reflection and improvement
are deceptively hard. The difficulty warrants adopting a more cautious risk management
attitude that emphasizes the incompleteness of formal risk assessment and pro- 
motes a general skepticism toward quantitatively bounding the certainty of our 
knowledge (Aven & Krohn 2014). While it is important to acknowledge the 
limits of human prediction, superficial or non-existent risk assessment is much 
worse and promotes a slide into technological fatalism. 

Previous efforts to improve the objectivity of science and technology assess- 
ments for policy decisions have proposed variations on the ‘science court’ 
(Kantrowitz 1967; Field Jr. 1994). The idea is to have science experts argue their 
position in front of a disinterested scientist from another field who would act 
as a judge or mediator. This would separate the judge and advocate positions as 
is common in the US judicial system. The proposal is an attempt to be
proactive, considering research controversies will eventually find their way to
the legal system if not adequately addressed elsewhere.

Coming back to the case study in the first chapter, a series of newsworthy 
biosafety lapses at CDC laboratories in the summer of 2014 were key events 
leading to the moratorium on H5N1 virus gain-of-function research. These 
breakdowns in safety at a well-respected institution raised alarm among the 
public and policymakers. The standard assurances that research was being
conducted safely were now questionable. The result was a precautionary backlash
typical of a ‘popular technological assessment’ (Jasanoff 2003). Such public 
responses often appear to ignore scientific evidence. However, they are not 
simply due to scientific illiteracy or incompetence,* but rather a reaction to
sophisticated analyses that lack credibility (NRC 2015). Calls for precautionary 
bans were not restricted to regulators and the public. Biosecurity experts also 
saw them as a reasonable response to the threat of avian influenza research 
gone awry (Klotz & Sylvester 2012). 

Because the research was not banned outright, one cynical interpretation 
of the entire debate was that the various moratoria were politically necessary
until public attention turned elsewhere and the final risk-benefit assessment
was a fig leaf to justify business as usual. It was noteworthy that the formal
deliberations
were not affected by a 2015 avian influenza outbreak in the US—the largest in 
decades. While the outbreak only affected poultry, the geographic immediacy 
and economic losses should have refocused public attention on the debate, but 
its impact was minimal. Given the underlying culture of technological inevita- 
bility prevalent in the US, perhaps the outcome is unsurprising. Nonetheless, 
the resulting minimally restrictive research policy for avian influenza was still 
a substantial shift from only a decade ago when both the NRC and NSABB 
broadly supported essentially complete self-regulation in the biosciences— 
a sentiment still popular in the EU where the European Academies Science 
Advisory Council still prefers to defer to the professionalism of the scientific 
community (Fears & ter Meulen 2016). 

It is also important to note these moratoria only applied to research funded 
by the US government. It would be optimistic to assume US policy on poten- 
tially pandemic pathogen research carries as much weight as the Recombinant 
DNA Advisory Committee, which issues guidelines that are mandatory only for 
NIH-funded research but have become widely accepted (Resnik 2010). While 
the influence of federal funding within the global research community is still 
substantial, it is shrinking. It is not inevitable that even closely aligned commu- 
nities, such as the European Union, will adopt US policies. This independence 
was evident in the conflicting NSABB and WHO assessments of H5N1 virus
research. Neither should the US expect the rest of the world to follow its lead
given its own rather inconsistent commitment to multilateralism. For example,
the US is the only United Nations member not party to the Cartagena Protocol
on Biosafety or the more general Convention on Biological Diversity, two
international agreements that regulate the transfer and use of living genetically
engineered organisms.

* This is commonly referred to as the deficit model, where any lack of public
support is assumed to be based on ignorance and the solution is more
explanatory lectures (Mooney 2010). Meanwhile, surveys suggest educational
attainment is not strongly correlated with support for controversial science
policy questions (Funk, Rainie & Page 2015). While more widely accepted today,
the first discussions of the underlying hubris of scientists (Feyerabend 1978)
were treated as blasphemous. One could argue in our post-truth, ‘fake news’
modern world, the pendulum has swung too far the other way and science has
lost its authoritative voice (Nichols 2017; Crease 2019). Yet the long-running
General Social Survey finds Americans have had consistently high confidence in
the scientific community for the past 40 years during a time when trust in
most public institutions has steadily fallen. Climate change denial may be
particularly alarming, but it is only one of many anti-science minority
positions throughout history that has eventually collapsed under its own
improbability. While the uncertain nature of knowledge is undeniable, so is
the human tendency to overplay uncertainty when personally convenient.

Additionally, given the competitive nature of NIH funding and the decreas- 
ing cost of advanced labs, the proportion of life scientists in the US working 
independently of federal funding is growing. Newer philanthropic organiza- 
tions, such as the Gates Foundation, are joining stalwarts like the Howard 
Hughes Medical Institute to provide a substantial source of research funding. 
Silicon Valley has also been luring life scientists from academia with the prom- 
ise of better resources and an emphasis on results rather than grant-writing and 
publications. 

The limited reach of US regulators within the global research community has 
been used as an argument for limited regulation—we cannot limit NIH-funded 
scientists for fear they will simply go elsewhere to perform their dangerous 
research. For example, China is expected to have multiple BSL-4 labs before 
2025. This is a precarious argument. It represents the same Cold War attitude 
that seemed to compel an arms race of unquestioned nuclear weapons research. 
It also makes an implicit assumption of superior ethical sophistication the US 
science research community may no longer deserve (Sipp & Pei 2016). 

While gain-of-function research on potentially pandemic pathogens is cur- 
rently conducted solely in well-funded labs with strict biosafety measures, 
the issues that underlie this debate are even more serious in other research 
communities. For example, synthetic biology has a thriving do-it-yourself 
independent lab sub-culture (Keulartz & van den Belt 2016) of amateurs and 
entrepreneurs working without regulatory oversight, funding, or training in 
biological fields. The danger extends beyond a lack of minimal biosafety train- 
ing to active high-risk attention-getting behavior (Baumgaertner 2018; Zhang 
2018). In the absence of new regulations that restrict who can access this tech- 
nology (or maybe even in spite of them), a tragedy and public backlash are 
nearly inevitable (Green 2016). 


Engaging multiple perspectives 


A common prescription for high uncertainty risk management is to encourage 
substantial public involvement. The public is critical to risk analysis because it 
serves as a source of ‘moral imagination’ (Johnson 1993), which lets us explore
the consequences of our actions and imagine the plight of others (Coeckelbergh 
2009). Conscious effort to enlarge the risk assessment process is necessary 
because of the pervasive myth of technocracy—the idea that only specialists 
are equipped to assess and manage technological risk decisions (Jasanoff 2016). 
As an added benefit, a discursive approach that emphasizes mutual education
can also be used for well-known risks the public consistently underestimates or 
overestimates (Renn & Klinke 2004). 

However, the idealized model of public participation, which includes a well- 
informed and representative public, is difficult to achieve (Rossi 1997; Lavery 
2018). Public engagement, while widely considered a prerequisite to good risk 
management, is not guaranteed to be useful or sufficient. Sometimes increas- 
ing the number of stakeholders in a decision process can lead to increased 
confusion and general intransigence. Citizen participation in technical pol- 
icy decisions can be disappointing in practice because the participants start 
off overwhelmed by technical information and end up overwhelmed by their 
inability to judge conflicting expert assessments. Given the variety of epistemic 
and ethical value assumptions inherent to the risk assessment process, evaluat- 
ing technological risk disputes is difficult enough for scientists already familiar 
with the technology, let alone the general public who must also absorb addi- 
tional background information. One way to mitigate these issues is to provide 
science literacy mini-courses to public participants that discuss the types of 
scientific disputes and the contingent nature of science and scientific consensus 
(Reaven 1987). 

Ironically, the need for scientific literacy training extends to scientists. While 
most working scientists have formal training in the methods and conduct of 
science, few are educated in its philosophical underpinnings. This is probably 
why some scientists are ambivalent or even dismissive when science policy 
issues arise. Despite the difficulties, it is important to engage the scientific com- 
munity. To ensure accountable science, dissenting scientists must have open 
channels for voicing concerns where they need not fear retribution. Likewise, 
scientists should be encouraged to reflect on their work rather than assuming 
the existing regulatory process, such as institutional review boards, will catch 
any ethical or safety issues. Jennifer Doudna, one of the discoverers of
CRISPR-Cas9, initially hesitated to get involved in the science policy
surrounding her
work: ‘I told myself that bioethicists were better positioned to take the lead on 
such issues. Like everyone else, I wanted to get on with the science made pos- 
sible by the technology’ (Doudna 2015). Eventually, she changed her mind and 
advocated for scientists to become prepared for dealing with the larger implica- 
tions of their work. 

That said, the attention of experts tends to be concentrated on their area of 
expertise. Scientists will have a natural tendency to permissively promote sci- 
ence, while security experts will tend to conservatively restrict perceived risks 
(Klotz & Sylvester 2009; Selgelid 2009). Because the science research commu- 
nity is largely self-regulated (with notable exceptions in nuclear research), this 
tends toward minimal regulation. Although a robust dialogue within the sci- 
entific community is healthy, ‘risk is more a political and cultural phenomenon 
than it is a technical one’ and unelected ‘privileged experts’ should not dictate 
the terms of technological risk debates (Sarewitz 2015). 
Conversely, self-regulation has some benefits. Security specialists are trained
to focus on signs of malicious intent but may be less cognizant of unintended 
harm. Signs of impending accidents and other unintentional harms may best be 
detected by the members of the research community who are fully immersed in 
the field and understand the formal procedures and tacit knowledge associated 
with the research. This may be one reason that the NSABB, whose members are 
primarily scientists, became increasingly concerned with biosafety risk issues 
in the H5N1 debate even though the board was created to deal with biosecurity 
concerns (Fink 2004). 

Despite a desire for a ‘broad deliberative process,’ the NSABB and NRC
evaluations were largely headed by experts in virology and public health. Even the
public comments were public only in the sense they came from outside of the 
organizations. For example, comments at the November 2014 NSABB meeting
came from six public health specialists, a virologist, and a policy analyst.
Furthermore, some of the most vocal experts had a vested personal interest in 
the outcome of the assessment. This led to some confusing science communica- 
tion. For example, when first presenting their findings, Dr. Fouchier and oth- 
ers made bold claims that influenza transmissibility in ferrets and humans was 
nearly equivalent. However, after the public backlash and subsequent threats 
to the viability of the research, Dr. Fouchier made much weaker claims about 
the use of ferrets as human surrogates and downplayed the transmissibility 
and lethality of the engineered virus (Kahn 2012; Lipsitch & Inglesby 2015). 
Whether intentional or not, this appeared to be a case of miscalculation in risk 
communication (Sandman 2012). That is, the initial announcement emphasized
the danger of the H5N1 research to attract attention from peers and future 
funding. However, when the public heard the same message and panicked, sub- 
sequent media interactions attempted to minimize the danger of the work. This 
is a good example of the balancing act in dangerous science: scientists must 
appear to be cutting-edge while at the same time appearing safely incremental. 

The concerns of narrow interests and bias led to calls for more active
public engagement in the deliberative process, a reasonable request considering
that most of the world’s population was potentially affected by H5N1 research 
(Fineberg 2015; Schoch-Spana 2015). Of course, there is an asymmetry in 
global effects. The risks are the lowest and the potential benefits are the great- 
est to individuals in nations with the best access to modern health care (Quinn 
et al. 2011). Thus, the people with the most to lose and the least to gain were 
largely absent from the discussion. One letter to the NSABB chair voiced
concerns that the deliberative process did not include enough risk assessment experts,
was not sufficiently international, did not have enough public comment oppor- 
tunities, was generally opaque and moving too fast, and had an inherent con- 
flict of interest by being funded by the NIH (Roberts & Relman 2015). While 
the NSABB acknowledged the need for more public input and many of the 
meetings were open to the public, the deliberations were poorly publicized and 
remained meetings of experts. 


So how do we achieve public science?


Science policy making is a political and ethical process and should not be left 
entirely to the scientific community. ‘Scientists may approach their research 
with the best of intentions and the good of society in mind, but it would be a 
mistake to assume that they could know what is best for the world—and a trag- 
edy to foist that burden on them’ (Evans 2013). So what is the solution? Engag- 
ing multiple perspectives increases the likelihood important considerations in 
risk assessment and viable options for risk management will not be missed. 
However, despite the global reach of modern science, truly inclusive public 
engagement at this scale is rarely attempted. The few previous attempts at inter- 
national-scale public participation have failed to demonstrate any impact that 
justified the considerable effort, expense, and time involved (Rask 2013). 

Some alternative forms of technology assessment have been proposed that do 
not rely solely on the scientific community. The most popular effort has been 
the creation of prediction markets (Mann 2016) that estimate the likelihood 
a technology will succeed or fail. Prediction markets could easily be adapted 
to estimate technological risk. A prediction market offers shares to the public 
that pay out only if an event occurs. The trading price of the shares reflects the 
collective belief in the likelihood of the event. As the trading price rises toward 
the payout amount, the estimated likelihood rises toward 100%. Likewise, a 
share price near zero reflects the public wisdom the event is unlikely to happen. 
Proponents of prediction markets appreciate their simple interpretation, the ease 
of participation, and their successes in predicting past political events compared 
to expert analysis or opinion polls. 
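
The price-to-likelihood reading described above can be sketched in a few lines 
of Python. This is a minimal illustration, assuming a binary contract that pays 
a fixed amount if the event occurs and nothing otherwise, and ignoring fees, 
interest, and trader risk preferences. The function name and the example prices 
are hypothetical. 

```python
def implied_probability(share_price: float, payout: float = 1.0) -> float:
    """Return the event likelihood implied by a binary contract's trading price.

    Assumes the contract pays `payout` if the event occurs and nothing
    otherwise, and that fees, interest, and risk preferences can be ignored.
    """
    if payout <= 0:
        raise ValueError("payout must be positive")
    if not 0 <= share_price <= payout:
        raise ValueError("share price must lie between 0 and the payout")
    return share_price / payout


# Hypothetical prices for a contract that pays 1.00 if the event occurs.
for price in (0.05, 0.50, 0.95):
    likelihood = implied_probability(price)
    print(f"price {price:.2f} -> implied likelihood {likelihood:.0%}")
```

A real market price also reflects how many people are trading and why, so the 
implied likelihood is only as good as the market behind it. 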

However, critics of applying prediction markets to science policy have argued 
science rarely consists of distinct near-term events with clear outcomes like 
sports, an election, a daily stock price, or a quarterly earnings report. Thus, it is 
inherently difficult to sustain the large market needed to generate useful results; 
most investors are uninterested in tying up their money by placing bets on 
ambiguous events in the distant future. Likewise, prediction markets share the 
same manipulation issues as other financial markets. For example, a scientist 
privy to insider information could be tempted to game the market for personal 
financial gain. These shortcomings suggest prediction markets might actu- 
ally make science policy worse (Thicke 2017). Another alternative, which uses 
prediction polls within forecasting tournaments (Tetlock, Mellers & Scoblic 
2017), sidesteps the market manipulation issue, but the difficulty remains of 
trying to fit amorphous science policy into a method that works best for 
short-term, specific, and unambiguous events.² 


2 One could also argue that short-term, well-defined science is the most likely to have 
commercial value. This is precisely the type of science that may already be getting 
outside scrutiny through venture capital or the stock market. 

Another idea for broader technological risk assessment is to require scientists 
to obtain liability insurance for their research (Farquhar, Cotton-Barratt & 
Snyder-Beattie 2017). The strength of this approach is that it 
ignores the rather pointless exercise of attempting to quantify research ben- 
efits and concentrates on risks using a mature industry that specializes in 
risk assessment. Furthermore, requiring a financial outlay for potentially 
dangerous research would encourage responsible behavior—perhaps more 
than any other single action. However, the insurance industry does not have 
the best track record for estimating and adequately insuring a range of rare 
but substantial risks (for example, mortgage risk in 2008). Although many 
insurance companies offer policies for events that are difficult to estimate 
(terrorism, cyberattacks, etc.), the mere existence of these policies does not 
prove the risks are well-known or adequately insured. Ultimately, the largest 
issue is that scientific liability insurance rests on the premise there is a price 
for everything. But what if we cannot agree on a price due to conflicting 
interpretations of evidence or the research violates societal values that 
cannot be priced? 

Outsourcing risk assessment to the financial industry brings up another 
important point common to all technological risk assessment. How do we 
engage the public so research does not further exacerbate inequality? It is well 
recognized the benefits of science and technology are heaped upon the residents 
of highly industrialized nations and trickle down to the rest of the world, if at 
all. Unfortunately, the risks of technological innovation are not distributed in 
the same fashion. The world’s poorest are often disproportionately subjected 
to the highest environmental and health risks from modern science. So why 
do they get a small or non-existent seat at the table when technological risks 
are assessed? 

The current model of public science assessment stems from the 1975 
Asilomar conference on recombinant DNA. Fewer than 150 people, mostly 
scientists, convened at a small conference center on the Pacific coast to draw 
up safety guidelines for the new technology. The meeting focused on biosafety 
risk assessment, specifically excluding biosecurity issues or related social and 
ethical issues. Many scientists have pointed to Asilomar as a successful model 
for self-governance because of the subsequent lack of recombinant DNA catas- 
trophes. Yet the scientific community regularly bemoans widespread public 
rejection of some of the results of this technology, such as genetically modified 
foods.³ Proponents of the Asilomar model of science policy have missed the 
crucial point that there was a much larger discussion to be had and the public 
was the most important stakeholder. After all, what good is safely conducted 
science if you are not allowed to put it to use? 


3 The public’s ‘anti-science’ rejection of GMOs is not solely concerned with the 
techniques themselves, but rather with specific modifications. To lump all GMOs 
together merely allows each side to selectively demonstrate the unreasonableness 
of the opposition. The same issue appears to be occurring with newer gene-editing 
techniques. This is a problem inherent to debating and regulating techniques 
rather than outcomes. 

Subsequent controversial technologies have suffered the same fate. How 
many people even knew there was an International Summit on Human Gene 
Editing in Washington DC in December 2015? Probably a lot fewer than the 
number of people who cared about the topic. The second International Sum- 
mit on Human Gene Editing in Hong Kong in November 2018 received a bit 
more media attention only because of the surprise announcement of the birth 
of gene-edited twins in China. 

When the next controversy arises, will the scientific community once again 
feel victimized as protesters vilify work that is only intended to help 
humanity? To be fair, engaging the public can be exhausting. 
It takes up a lot of time that could be spent doing research. It can also be 
frustrating. For anyone even remotely familiar with the climate change debate 
in the US, it is clear some people choose to be willfully ignorant when their 
livelihood, lifestyle, political ideology, or religious beliefs are threatened. So 
public engagement does not promise success. However, skipping it practically 
guarantees failure. 

That said, it is also important not to view public engagement as a bottom- 
less hole of stupidity scientists must attempt to fill with superhuman patience. 
Yes, public opinions can often be painfully uninformed or seemingly illogical, 
but they can also be nuanced, insightful, and absolutely correct. It is also true 
there is a long history of scientists making claims that turned out to 
be wrong, which has trained the public to be skeptical. In a democracy, open 
political debate remains our best method for separating the wheat from the 
chaff and arriving at a current best approximation of reality. 

One proposed alternative to pseudo-public science policy is to create ‘global 
observatories’ (Jasanoff & Hurlbut 2018) for science controversies, which 
would serve as an information clearinghouse, track developments in the field, 
organize public meetings, and collect public survey data. The new model, 
inspired by the broad work of the Intergovernmental Panel on Climate Change, 
recognizes effective science policy is not a matter of just including a few aca- 
demic bioethicists in a meeting of scientists. Rather, science must be widely 
and continuously discussed, like sports and weather. Compared to the ideal of 
the global observatory, the NSABB, with its narrow mission, may have been 
the wrong model for appropriate technical oversight of the avian influenza 
research debate. 

A less technical alternative approach to expanding the sphere of considera- 
tion in risk management is through literature. The value of literature to improve 
empathy and moral imagination has long been argued and more recently sup- 
ported by research (Nussbaum 1990; Bal & Veltkamp 2013; Kidd & Castano 
2013). Philosopher Hans Jonas called for ‘a science of hypothetical prediction’ 
that acknowledged the impossibility of ‘pre-inventing’ technology, but realized 
that the reach of modern technology often exceeded our foresight.⁴ He argued 
one of the best options was to cultivate an imaginative technological skepticism 
and suggested science fiction as a valuable source of inspiration for exploring 
the possible and laying the groundwork for risk management. More recently, 
the idea of ‘science fiction prototyping’ (Johnson 2011) has been proposed to 
create narrative scenarios of technological development that better engage the 
public. In the case of technological risk analysis, science fiction may suggest 
risk management options and provide fertile ground for moral imagination— 
a way to play out future technological scenarios and explore their ethical 
implications. 


Inherent safety 


Having discussed the various stakeholders and how to involve them in the 
science policy process, let us turn to what they should be discussing. A com- 
mon reaction in modern risk management is to increase the resiliency of a 
system in the face of deep uncertainty (Linkov et al. 2014). Physical systems 
can be made more resilient by increasing the redundancy or independence of 
functions. But how do we make a line of research or technology more resilient 
when the potential risk to humanity is widespread or catastrophic? We could 
try to make society itself more resilient to the effects of research and emerg- 
ing technologies, but it is not clear this is possible. It is also a rather unsettling 
and unrealistic approach—discussions of increasing human resiliency through 
population growth or by geophysical barriers, such as underground bunkers or 
space colonies, are not considered serious public policy (yet). For now, a better 
response to dangerous science is to reduce the risk. However, considering the 
apparent inevitability of technological progress, it would seem technological 
risk can only increase. What are we to do? Even if we accept the idea dan- 
gerous science will usually continue until it is discredited or something better 
comes along, we need not accept all research must be inevitable and unfettered. 
Between the extreme positions of outright banning technologies and fatalistic 
permissiveness lie the traditional methods of risk management, which gener- 
ally fail to resolve research controversies. However, a more satisfying moderate 
risk management strategy exists within the concept of inherent safety. 
Conventional risk management generally focuses on reducing the likelihood 
of an accident through safety procedures and equipment. In contrast, the 
principle of inherent safety emphasizes the elimination of the hazard itself. The 
idea was first proposed in the chemical industry in a paper aptly titled ‘What you 
don’t have, can’t leak’ (Kletz 1978). Inherent safety has been described as 
common sense, but not common knowledge. The idea is well-known in the 
chemical and nuclear engineering communities but not among scientists and 
engineers in other areas (Srinivasan & Natarajan 2012). In the nuclear engineering 
community, there is a different emphasis on the basic inherent safety principles 
due to the difficulty of reducing the hazardous material that is fundamental to the 
nuclear power process. Rather than reducing the material, the emphasis is on 
reducing hazardous systems (for example, designing a nuclear power plant that 
is incapable of a reactor core meltdown). 


4 Jonas suggested one of the earliest forms of the precautionary principle: ‘It is the 
rule, stated primitively, that the prophecy of doom is to be given greater heed than 
the prophecy of bliss’ (Jonas 1984). 

The distinction between reducing the hazard versus reducing the likeli- 
hood of the bad event has also been termed primary and secondary preven- 
tion, respectively (Hansson 2010a). By selecting safer materials during the 
early design phase, overall risk is reduced far more than by merely adding on 
safety systems or safety procedures as an afterthought. The basic principles of 
inherent safety consist of (1) minimizing the total amount of materials used, 
(2) substituting hazardous materials with alternatives that do not have the 
hazardous quality, (3) using materials under less hazardous conditions, and 
(4) simplifying processes to reduce errors (Kletz 1985; Khan & Amyotte 2003; 
Edwards 2005). 

Another benefit of inherent safety is the ability to simultaneously address 
safety and security concerns. Terrorists are attracted to material hazards that 
can be exploited and safety systems that can be subverted. Inherently safe mate- 
rials and systems discourage malevolent actors. Additionally, inherent safety is 
often simpler and less expensive than standard risk management efforts. Tra- 
ditional approaches incur extra physical, procedural, and administrative layers 
that may include (Möller & Hansson 2008): 


• safety reserves, where reserve capacity is included in the design (e.g., safety 
  margins and safety factors); 

• fail-safe design, where a system is designed to fail in a way that minimizes 
  damage (e.g., safety barriers that isolate failure); and 

• procedural safeguards (e.g., warning systems, safety audits, and safety 
  training). 


Historically, safety has been a secondary concern relegated to engineers at 
the production level instead of a primary concern at the early stages of research, 
when the most impact could be made (Edwards 2005). This is perhaps one 
reason the research community has not embraced the idea of inherent safety. 

Unfortunately, the opportunities for inherent safety in research are only 
slightly more obvious in hindsight. In June 2014, laboratory workers were 
exposed to potentially viable anthrax bacteria at a CDC bioterrorism response 
lab. The subsequent internal investigation report found the exposure arose 
from testing whether a new mass spectrometer could quickly detect B. anthracis 
(CDC 2014). However, the instrument manufacturer stated the instrument 
could not distinguish between virulent and relatively benign strains of the 
same species. Thus, using a low-risk non-pathogenic strain would have yielded 
the same results. Despite the obvious application of inherent safety, the CDC 
report recommendations still focused on traditional safety management, such 
as revised biosafety protocols and procedures; hazard reduction was mentioned 
only once and in the fifth of eight recommendations. 

The findings of the report also reinforce two previous arguments. First, even 
at facilities with multiple safety barriers and extensive training, unanticipated 
errors do occur. This suggests traditional risk management solutions may offer 
false security by reducing the perception of risk more than the actual likelihood 
of bad events. Second, considering safety in the research design phase can often 
accomplish the same scientific goals while sidestepping controversy. 

Ironically, a primary reason why inherent safety has been slow to catch on 
in industry is risk aversion (Edwards 2005). Most organizations are hesitant 
to fix things that do not appear to be broken—even systems with many near 
misses are considered to be safe until something too egregious to ignore 
occurs (Paté-Cornell & Cox 2014). Neither do organizations want to incur 
the potential risks of deviating from tradition and known practices. These 
same concerns are potential roadblocks to using inherent safety principles in 
dangerous science. By its nature, research is full of new ideas, but researchers 
often use well-established (and peer-reviewed) techniques. Thus, scientists 
may be just as hesitant as industry to incur the extra time and expense needed 
to consider inherent safety principles. Researchers who are focused on work 
efficiency may find the safeguards already in place to be onerous and may 
see any additional requests as superfluous. However, public pressure and 
threats to funding may lead scientists engaged in dangerous science to view 
inherent safety as a reasonable compromise. It is even possible inherently safe 
research could be faster if it allows cumbersome traditional safety systems to 
be avoided. 

Inherent safety, like all human endeavors, is limited by knowledge and crea- 
tivity. Invoking inherent safety first requires the realization a planned course 
of action is potentially unsafe. Likewise, inherently safe alternatives are not 
always obvious and may require considerable innovation. There is also the risk 
a design change may create a new unforeseen risk that did not previously exist. 
To avoid trading one potential risk for another unrecognized risk, it is useful to 
engage multiple perspectives to help offset limitations of imagination. 

Given these caveats, it should be obvious safety is an iterative and never- 
ending process. The chemical industry, where the idea of inherent safety was 
first introduced, still struggles in the quest to minimize its negative impacts 
(Geiser 2015). Ultimately, the primary value of inherent safety is providing a 
complementary philosophical approach to standard probabilistic risk analytic 
thinking, which treats the hazard as a given (Johnson 2000). 


Application to the H5N1 debate 


While the biosecurity risks discussed in the H5N1 research debate were largely 
theoretical, the biosafety concerns were based on historical data and made a 
compelling case for the use of inherent safety principles. For example, the 1977 
Russian influenza outbreak was an H1N1 strain genetically identical to a virus 
that caused a 1950 epidemic. The lack of mutations in the intervening 27 years 
suggested a frozen virus source—presumably from an unintended laboratory 
release (Webster et al. 1992; Lipsitch & Galvani 2014). Research accidents con- 
tinue to occur even in the most modern and technologically advanced research 
facilities. A review of US reports of theft, loss, or release of select agents and 
toxins compiled by the CDC between 2004 and 2010 found no reports of theft, 
88 reports of loss, and 639 cases of accidental release—14 of which occurred at 
BSL-4 facilities (Henkel, Miller & Weyant 2012). 

While inherently safe research seems like common sense, it is not emphasized 
in biosafety or biosecurity guides, which focus on formalized processes and 
training. The continued emphasis on traditional methods is unfortunate given 
the poor record of implementing such measures. For example, the European 
Committee for Standardization’s CWA 15793 framework for lab biosafety was 
adopted in 2008, but by 2013 only one third of the European Biosafety 
Association’s 118 members were using the framework and 15 percent were 
unaware of its existence (Butler 2014). Likewise, there has been insufficient effort 
to integrate biosecurity considerations into the culture of the life sciences. An 
informal survey of 220 graduate students and postdoctoral fellows at top NIH- 
funded US institutions found 80 percent of the respondents had bioethics or 
biosafety training, while only 10 percent had biosecurity training (Kahn 2012). 
Some theories for the dearth of biosecurity training include a lack of perceived 
relevance, a lack of familiarity, and a general reluctance to acknowledge life 
science research could be used to cause harm. Perhaps the largest issue is the 
general lack of awareness of biosecurity issues in the life science community 
(National Academies 2017a). It has been suggested the culture of safety in the 
life sciences could be most quickly improved by replicating the established 
models and hard-earned lessons of other fields (e.g., nuclear engineering) 
to create ‘high-reliability organizations’ (Trevan 2015; Perkins, Danskin & 
Rowe 2017). 

Even after a string of incidents in the summer of 2014 placed renewed focus 
on biosafety, subsequent events suggested appropriate laboratory safeguards 
had yet to be implemented. In December 2014, it was discovered live Ebola 
virus was sent to the wrong lab within the CDC. In May 2015, it was discov- 
ered a US Army lab in Utah had likely shipped live, rather than inactivated, 
B. anthracis samples to approximately 30 US and foreign labs over several years. 
This suggested reducing human error was difficult even when biosafety issues 
were in the headlines. 

Early in the debate, critics cited the 1947 Nuremberg Code, a foundational 
bioethics document, to argue most gain-of-function research on potentially 
pandemic pathogens was not ethically defensible because the same benefits 
could be attained by safer alternatives (Lipsitch & Galvani 2014). Specifically, 
the two main goals of these experiments, guiding vaccine development and 
interpretation of surveillance data, can be achieved by other methods, such as 
molecular dynamical modeling, in vitro experiments that involve single pro- 
teins or inactive viral components, and genetic sequence comparison studies 
(cf. Russell et al. 2014). Likewise, the goal of reducing influenza pandemics 
could be more safely pursued by research on universal flu vaccines (Impa- 
gliazzo et al. 2015), broad-spectrum antiviral drugs (Shen et al. 2017), and 
improved vaccine manufacturing. All these alternatives fall within the general 
concept of more inherently safe research, although there is debate over the 
viability of some avenues, such as universal flu vaccine research (see Cohen 
2018; Nabel & Shiver 2019). Ultimately, the narrow set of questions that can 
only be answered by gain-of-function research may have limited public health 
benefits (Lipsitch 2018). 

While the initial HHS guidelines for regulating gain-of-function research 
did include the concept of inherent safety in their third criterion (Patterson et al. 
2013), the discussion among scientists and regulators was focused on traditional 
risk management. However, suggestions to incorporate inherent safety slowly 
gained support as the debate progressed. Eventually, even Dr. Kawaoka, the lead 
scientist for one of the original H5N1 papers, conceded some research could use 
alternative techniques, such as loss-of-function studies, less pathogenic viruses, 
or analyses of observable traits (NRC 2015). This is an admirable evolution of 
perspective considering researchers spend substantial effort to become experts 
in particular techniques; one should not expect them to instantly abandon their 
prior work when potentially better options are presented. 

Because the application of inherent safety is still relatively new to the bio- 
sciences, there are some special considerations. First, the principles of inher- 
ent safety are more restrictive than when applied to the fields of chemical and 
nuclear engineering because of the unique ability of biologically hazardous 
materials to replicate. Thus, inherent safety may require a hazardous organ- 
ism be eliminated, rendered unable to reproduce, or have survival limitations 
imposed. This last approach has been pursued in synthetic biology through 
genetically modified organisms that only survive in the presence of an anthro- 
pogenic metabolite, such as a synthetic amino acid (Mandell et al. 2015). 

That said, many applications of inherent safety in the biosciences do not 
require such extreme innovation. For example, certain types of attenuated 
viruses were found to have the potential to recombine into more virulent 
forms when the same livestock population was vaccinated with two different 
forms of the same vaccine (Lee et al. 2012). Traditional biosafety risk 
management would suggest stricter policies and processes are needed to prevent 
double vaccinations. Meanwhile, an inherent safety approach would attempt to 
reformulate the vaccine using an inactivated rather than an attenuated virus to 
remove the potential hazard altogether.⁵ 


Improving Risk-Benefit Assessments 


The second chapter of this book compares several approaches to valuing the 
benefits of research to show how they can yield disparate assessments. Like- 
wise, the third chapter details some of the many value assumptions inherent 
in risk assessments that prevent claims of objectivity. These lines of argument 
help explain why risk-benefit analysis is more appropriate for evaluating well- 
defined problems with copious data but has limited utility for resolving debates 
over dangerous science where uncertainty is high and data are sparse. Does this 
mean risk-benefit analysis has no place in assessing and managing dangerous 
science? No, it means a risk-benefit assessment, as commonly used to assess 
dangerous research, is ineffective as a primary decision criterion. However, we 
can considerably improve its utility by following a few recommendations. 


Recommendation 1: Change expectations 


While the purpose of risk assessment is understood in theory, it is less so in 
practice (Apostolakis 2004). The effort of generating a time-consuming and 
costly risk-benefit assessment often carries an expectation the results will trans- 
late into a decision. However, expecting all risk-benefit assessments to yield 
clear-cut, consensus-building answers is unrealistic. The controversies sur- 
rounding some lines of research are based on fundamental disagreements over 
ethics and appropriate public risk that will not change due to the findings of 
a formal risk assessment. Certain issues are more amenable to analysis than 
others for reasons already noted. Uncontroversial and well-defined questions 
can essentially use an assessment as a decision tool, but more controversial and 
complex questions should treat a risk-benefit assessment as a risk exploration 
tool. While this may seem unhelpfully vague, it simply means once an assess- 
ment itself becomes a source of controversy, more data will probably not resolve 
disagreement. Controversy is a sign to switch expectations from decision 
tool to risk exploration tool. In fact, a thorough risk assessment may actually 
increase controversy because it uncovers previously overlooked risks. 


5 Vaccine development has also been suggested as a more inherently safe method of 
combating antimicrobial resistance in common pathogens. Antibiotics are difficult 
to develop and lose effectiveness over time as resistance spreads, while vaccines are 
easier to develop and vaccine resistance is rare (Rappuoli, Bloom & Black 2017). 

Furthermore, any request to conduct a risk assessment will not be 
accompanied by a well-defined procedure because there is no consensus on what 
constitutes the proper accounting of benefits or what underlying value assump- 
tions should be used to estimate risk. By concentrating on the correctness of 
an assessment, stakeholders merely end up creating a secondary debate. The 
simple solution is to make sure that all stakeholders—policymakers, analysts, 
scientists, and the public—understand formal risk-benefit assessments inform 
the decision process but are not the decision process itself. A quality risk- 
benefit assessment should provide insight within the broader scientific and 
political debate. 

The H5N1 research debate serves as a classic example of when to forgo the 
traditional use of risk assessment as a decision tool. Before the analysis was 
conducted, expert opinion on the annual likelihood of a research-induced pan- 
demic ranged from better than 1 in 100 to less than 1 in 10 billion. This wild 
disparity among opposing experts guaranteed the outcome of any single formal 
risk assessment was also going to be contentious. Likewise, there was no con- 
sensus in the scientific community regarding the basic value of the research. 
Some scientists claimed the work was critical to public health and relatively 
low-risk (Morens, Subbarao & Taubenberger 2012; Palese & Wang 2012), while 
others claimed the approach had no predictive ability useful to pandemic pre- 
paredness and presented an unnecessary public health risk (Mahmoud 2013; 
Rey, Schwartz & Wain-Hobson 2013). Wisely, some individuals raised doubts 
early on that a risk-benefit assessment would resolve the debate (Uhlenhaut, 
Burger & Schaade 2013; Casadevall & Imperiale 2014). The NSABB, which 
had formerly noted risk-benefit assessments were subjective (NSABB 2007), 
originally discussed the H5N1 assessment in the terms subjective and objective 
but later reframed its expectations by switching to qualitative and quantitative. 
Anyone who went into the process believing a risk-benefit assessment was a 
consensus-building tool was bound for disappointment. 
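
To give a feel for how far apart these expert estimates are, the short Python 
sketch below converts an annual likelihood into the chance of at least one event 
over a longer horizon. It assumes, purely for illustration, that years are 
independent and that the annual likelihood stays constant. The 10-year horizon 
and the function name are hypothetical choices, not figures from the debate. 

```python
import math


def cumulative_probability(annual_probability: float, years: int) -> float:
    """Chance of at least one event in `years`, assuming independent years."""
    if not 0 <= annual_probability <= 1:
        raise ValueError("annual_probability must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    if annual_probability == 1:
        return 1.0 if years > 0 else 0.0
    # 1 - (1 - p)^n, computed with log1p/expm1 so very small p is not rounded away.
    return -math.expm1(years * math.log1p(-annual_probability))


# The two ends of the expert range quoted above, over a hypothetical 10 years.
for label, p in (("1 in 100", 1e-2), ("1 in 10 billion", 1e-10)):
    print(f"{label} per year -> {cumulative_probability(p, 10):.3g} over 10 years")
```

Under these assumptions the two ends of the range stay roughly eight orders of 
magnitude apart at any short horizon, which is why no single assessment could 
be expected to satisfy both camps. 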


Recommendation 2: Use broad uncertainty assumptions 


Uncertainty is inherent to all risk-benefit assessments, and its characterization is 
critical to an assessment’s utility. Even for risk-benefit assessments that address 
fairly narrow questions, it is best to use a broad conception of uncertainty. This 
includes acknowledging not all uncertainty can be represented probabilistically 
without making important assumptions, dependencies among parameters are 
not always known, and the correctness or completeness of the model is often 
ambiguous. The value assumptions roadmap outlined in the third chapter can 
be used as a guide. Even in a highly quantitative assessment, broader notions of 
uncertainty can still generate informative results; more importantly, the results 
will not claim certainty that does not exist. 


Recommendation 3: Use multiple methods 


When resources allow it, assessments will benefit from using multiple tech- 
niques developed from risk analysis. Some examples would be to use both a 
Monte Carlo simulation and probability bounds analysis together or to have 
a narrow quantitative assessment accompanied by a broad qualitative 
assessment. The work in the second and third chapters provides guidance on the range 
of methods available. The value of multiple methods is to give the proponents 
of various metrics and statistical techniques (based on their epistemic prefer- 
ences) a chance to feel confident about the results of the assessment. Likewise, 
individuals without any methodological preference will be comforted by the 
thoroughness of the analysis and the range of analytic frameworks. 
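
As a rough illustration of pairing two methods, the Python sketch below runs a 
Monte Carlo simulation alongside simple interval bounds on the same toy model. 
Everything in it is an assumption made for illustration: the two-factor model 
(an accident likelihood multiplied by the likelihood that an accident leads to 
an outbreak), the parameter ranges, and the uniform sampling distributions are 
hypothetical, and the interval calculation is only a crude stand-in for a full 
probability bounds analysis. 

```python
import random
import statistics

# Hypothetical parameter ranges (lower, upper); not taken from any real assessment.
P_ACCIDENT = (1e-4, 1e-2)  # annual likelihood of a laboratory accident
P_OUTBREAK = (1e-3, 1e-1)  # likelihood that an accident leads to an outbreak


def interval_bounds(a, b):
    """Bound the product of two non-negative quantities known only as ranges."""
    return a[0] * b[0], a[1] * b[1]


def monte_carlo(a, b, n=10_000, seed=0):
    """Sample the product, assuming independent uniform inputs (a strong assumption)."""
    rng = random.Random(seed)
    return [rng.uniform(*a) * rng.uniform(*b) for _ in range(n)]


low, high = interval_bounds(P_ACCIDENT, P_OUTBREAK)
samples = monte_carlo(P_ACCIDENT, P_OUTBREAK)
p95 = statistics.quantiles(samples, n=20)[-1]  # last of 19 cut points = 95th percentile

print(f"interval bounds:             {low:.1e} to {high:.1e}")
print(f"Monte Carlo median:          {statistics.median(samples):.1e}")
print(f"Monte Carlo 95th percentile: {p95:.1e}")
```

The Monte Carlo summary looks precise but depends on the assumed distributions 
and independence. The interval bounds are much wider but make no distributional 
claim, which is why the two are more informative side by side than either alone. 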


Recommendation 4: Use the analysis to design better research 


If analysts follow the first three recommendations, a risk-benefit assessment 
will be better suited for exploring risk. This can help researchers rethink their 
work to accomplish their research goals by inherently safer or more ethical 
means. Ideally, the best opportunity to resolve a debate over dangerous science 
is to apply the inherent safety design concept to avoid the debate altogether. 

As previously mentioned, this is no easy task. It requires a new mindset and 
a cultural shift within the communities of scientists and engineers. If we are to 
embrace the idea of inherent safety, the conception of every research project 
must start with a foundation of humility onto which we build our clever ideas. 
Whenever we ask, ‘Can this be done?’ we must also ask, ‘Will it fail safely when
something goes wrong?’


Science policy done better: Gene drives 


Given the many examples in which poor science policy seems to be failing 
us, it is useful to look at a positive example of science risk assessment and 
management. One recent example is the work on gene drives. The basic con- 
cept of a gene drive is a genetic mutation that is nearly universally inherited 
and can spread quickly through a species. While they are known to naturally 
occur in some species, bioengineers now have the ability to create gene drives 
that accomplish specific tasks. While a gene drive could be used to acceler- 
ate a genetic study in the lab, a bolder and more controversial application of 
gene drives is to eliminate vector-borne diseases by mutation or eradication 
of their host species. Given the potential danger of such efforts, some scien- 
tists involved have been proactive about their work and have sought formal 
risk assessment and public input before proceeding too far. This has been a 
challenge considering the speed with which the science is developing. The first
suggestion of a CRISPR-Cas9-mediated gene drive for species alteration was 
published in 2014 (Esvelt et al. 2014). A paper demonstrating an (inadvertent) 
CRISPR gene drive in laboratory fruit flies was published the following year 
(Gantz & Bier 2015), and a National Academies report on gene drive policy was
published in 2016 (National Academies 2016). 

Gene drives appear to be a particularly powerful biotechnology with endless 
ecological applications. A few proposals include making white-footed mice, 
the primary vector for Lyme disease, immune to the bacterium; eradicating all 
invasive rats, possums, and stoats from New Zealand; and eradicating invasive 
spotted knapweed and pigweed from the US. However, the most frequently 
discussed use is the mutation or elimination of some species of mosquito. 

The benefit to humans of eliminating mosquitoes is obvious. They are the pri- 
mary vector for a multitude of diseases, including malaria, yellow fever, dengue 
fever, West Nile virus, chikungunya, Zika, and several types of viral encephali- 
tis. According to WHO estimates, malaria alone results in approximately half 
a million deaths (mostly children) each year. However, any proposal to drive a 
species to extinction is inherently controversial. While many biologists believe 
the ecological niche occupied by mosquitoes would be quickly filled by other 
insects (Fang 2010), mosquitoes provide an ecological service as pollinators 
and are an important food source for many species (Poulin, Lefebvre & Paz 
2010). The substantial biomass mosquitoes comprise in some arctic and aquatic 
environments hints at the potential impact of their absence. Likewise, the con- 
cern a gene drive mechanism might be transferred to another species and drive 
an unintended target to extinction is reason enough to proceed with great cau- 
tion and has already prompted formal risk-benefit assessments (e.g., Hayes 
et al. 2015). 

If we view such reports with the ultimate purpose of spurring us to inno- 
vate more inherently safe research and technologies, we can see the value of 
gene drive risk assessments. Initial discussions centered on using a gene drive 
engineered to destroy X chromosomes during sperm development (Galizi et 
al. 2014). This results in a male-dominant trait that is passed on to subsequent 
generations and eventually drives the species to extinction when there are no 
more females to reproduce. Since then, the self-perpetuating extinction drive 
has become even more efficient with extinction likely within a dozen genera- 
tions (Kyrou et al. 2018). However, other options that might have fewer unin- 
tended consequences have also emerged. Some of these less drastic suggestions 
include genetically engineering mosquitoes to be immune to the malaria par- 
asite (Gantz et al. 2015); using two strains of the common bacterial parasite 
Wolbachia to infect male and female mosquitoes, which prevents successful 
reproduction; creating male mosquitoes that carry a gene that prevents off- 
spring from maturing (Phuc et al. 2007; Windbichler et al. 2007); or using Wol- 
bachia to prevent disease transmission by female mosquitoes. The last three 
alternatives have already been tested in the real world with promising results 
but have very different risk-benefit profiles. The genetic modification solution
is self-limiting, reduces the overall mosquito population, and requires expen- 
sive redeployment. Meanwhile, the second Wolbachia solution is self-perpetu- 
ating, maintains the local mosquito population, and presently exists in nature 
(Servick 2016). The public is already starting to see these alternatives as prefer- 
able to the status quo of using insecticides, which have considerable ecological 
impacts and efficacy issues as resistance develops. 
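
The male-biasing extinction drive described earlier in this discussion can be illustrated with a deliberately simple toy model. This is a sketch under strong assumptions, not a reproduction of any published model: the drive is treated as if every son of a carrier male inherits it, mating is random, generations are discrete, and all parameter values and names are hypothetical.

```python
def simulate_sex_ratio_drive(male_bias=0.95, offspring_per_female=6.0,
                             capacity=10_000.0, females=5_000.0,
                             wild_males=4_900.0, drive_males=100.0,
                             max_generations=200):
    """Return the generation in which females fall below one, else None.

    Toy assumptions (all hypothetical):
    - drive males father a fraction `male_bias` of sons, all carrying the drive
    - wild-type males father sons and daughters in equal numbers
    - total offspring per generation is capped at `capacity`
    """
    for generation in range(1, max_generations + 1):
        males = wild_males + drive_males
        if males <= 0:
            return generation  # no fathers left, population cannot reproduce
        drive_share = drive_males / males  # chance a random father carries the drive
        offspring = min(offspring_per_female * females, capacity)
        females = offspring * ((1 - drive_share) * 0.5
                               + drive_share * (1 - male_bias))
        wild_males = offspring * (1 - drive_share) * 0.5
        drive_males = offspring * drive_share * male_bias
        if females < 1.0:
            return generation
    return None


if __name__ == "__main__":
    print("Biased drive, extinct in generation:", simulate_sex_ratio_drive())
    print("No bias (control):", simulate_sex_ratio_drive(male_bias=0.5))
```

In this sketch the control run never collapses, while the biased drive spreads through the males and then starves the population of females. The outcome is sensitive to the assumed fecundity: if each female leaves enough surviving offspring, the toy population persists despite the drive. This is the kind of assumption a risk assessment would need to surface.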

Conversely, there is a growing concern among gene drive researchers that 
the technology is not powerful enough. The CRISPR-Cas9 system is known 
to occasionally make random DNA additions or subtractions after a cut. If the 
cut is sewn back together before substitution, then it is no longer recognized
as a gene splicing site. Likewise, natural genetic variation within species means 
some individuals within the target species will be immune to the gene drive sub- 
stitution (Hammond et al. 2016). Over time, resistance to a gene drive should 
be naturally selected. Discovery of a variety of anti-CRISPR proteins (Acrs) 
widely found in bacteria and archaea also suggests there are natural defenses to 
the technology. Early work using mouse embryos has suggested constructing a 
consistent gene drive in mammals is more difficult than in insects but nonethe- 
less achievable (Grunwald et al. 2019). 

However, biologist Kevin Esvelt, an originator of the CRISPR-mediated gene
drive concept, argues mathematical modeling of first-generation gene drives 
demonstrates they are highly invasive in nature and need to be modified before 
even field trials should proceed (Esvelt & Gemmell 2017). Compounding the 
complexity of the gene drive discussion is the wide range of potential uses. An 
aggressive gene drive may be desirable if the intent is to rid a remote island of a 
destructive invasive species but is a serious cause for concern if introduced, for 
example, in a marine environment where there is high potential for the drive 
to spread globally. 

The work of Dr. Esvelt and colleagues is particularly praiseworthy in its fore- 
sight to proactively address the potential dangers of gene drives for unintended 
consequences or even for weaponization by malevolent actors. Some of these 
precautions include creating secondary anti-gene drives that can undo other 
drives (albeit imperfectly), developing gene drives split into two parts that are 
not always inherited together to slow the process, or using gene drives sepa- 
rated into even more distinct components that operate in a sequential daisy- 
chain but eventually burn out as stages of the chain are lost through successive 
generations. 

These technical safeguards are accompanied by perhaps an even more 
important contribution to science policy: pushing the scientific community 
toward open discussions among researchers and the public. Much of Dr. 
Esvelt’s work involves sincere dialogue with potential test communities (Yong 
2017) and fellow scientists in what amounts to a pragmatic version of ‘real- 
time technology assessment’ (Guston & Sarewitz 2002). Dr. Esvelt’s efforts are 
backed by the National Academies report on gene drives, which states, ‘Experts acting alone
will not be able to identify or weigh the true costs and benefits of gene drives’
(National Academies 2016). This is a difficult undertaking given the existing 
competitive structure of science funding, which unintentionally encourages 
keeping secrets until publication is ready. While the gene drive research 
community remains small, cooperation is possible. It remains to be seen if the 
growing field will eventually revert to standard scientific secrecy and the dangers 
that go with it. Furthermore, a primary source of funding for early-stage gene 
drive research has been the US military (Defense Advanced Research Projects 
Agency), where open public review does not come naturally. 

The ultimate success of Dr. Esvelt’s efforts to reform science remains to be
seen. However, regardless of whether one considers him to be on the vanguard 
of a new era of open public science or just the latest in a long line of quixotic 
warriors tilting against the unyielding edifice of science culture, it is impor- 
tant to understand he is absolutely correct. Hopefully, this book provides the 
convincing detailed arguments for why scientists and engineers should follow 
his lead. 


Final Thoughts on Dangerous Science 


A primary theme of this book is that no detailed risk-benefit methodology to assess
dangerous science exists because no consensus can be found regarding exactly 
how to categorize and quantify the risks and benefits of any controversial sci- 
ence program. The various value judgments inherent in the process tend to yield 
an assessment that is controversial itself. The practical advice here is to simply 
temper expectations of risk-benefit assessment as a quantitative decision tool: 
it is better viewed as a tool for exploring and communicating the social impact 
of research and technology. The discussion of the practical and philosophical 
difficulties underlying risk-benefit assessment in this book should help sci- 
entists and engineers perform assessments that better clarify specific areas of 
consensus and disagreement among researchers, the public, and policymakers. 
Likewise, formal assessments should not be used as a stalling tactic to distract 
concerned citizens. 

Another theme is that in the absence of a convincing assessment, pre-existing 
technological risk attitudes guide management decisions. I note a general trend 
of permissive management of dangerous science and suggest, when possible, 
redesigning research to avoid controversy is the most viable way forward. Les- 
sons learned from the case study of controversial H5N1 avian influenza virus 
research can be quite useful when presented to a larger audience. The research 
community is not monolithic; insight from past events can be quickly forgot- 
ten, and the best practices in one field are often ignored by other fields that face 
similar problems but have little interaction. The following are a few additional 
lessons we can glean from the H5N1 debate. 

Post hoc management is difficult


Once funding has been obtained and work has begun, researchers have more 
than a purely intellectual position regarding a line of research. Funding creates 
an immediate personal financial interest as well as a long-term career impact. 
As Upton Sinclair said, ‘It is difficult to get a man to understand something 
when his job depends on not understanding it.’ Once work has begun, the
investment of time and effort creates an emotional attachment. Researchers 
will defend their work for reasons that may have more to do with personal 
reputation and feelings of ownership than with the merits of the work itself. 
Expecting researchers to self-govern their work is asking a lot of even the most 
well-intentioned scientists. 

Furthermore, post hoc management options are limited. Both openness 
and secrecy approaches to science dissemination can have unintended con- 
sequences (Lewis et al. 2019). For potentially dangerous research, censorship 
is commonly proposed. However, the ubiquity of modern communications 
technology makes effective censoring increasingly difficult. Moreover, past 
censorship, mostly in the form of classification for national security pur- 
poses, has an unpleasant history of shielding scientists from essential public 
oversight (Evans 2013). Openness can reduce ethical issues when researchers 
become less inclined to be associated with research that is publicly criticized. 
It is also unclear whether restricting communication is a useful technique for 
reducing technological risk. Open debate engages multiple perspectives and 
improves the likelihood potential accidents or unintended consequences will 
be discovered. 

Conversely, once research is published, other scientists may attempt to 
replicate the work or use the results to perform similar research, thereby 
increasing the overall risk. If the work is potentially dangerous, publication 
may be encouraging other laboratories to also perform dangerous work. It 
would seem advances in science, yielding increasingly powerful technology, 
challenge our ability to maintain a free society. However, many of these issues 
do not arise when considering proactive regulation at the funding stage.⁶


6 Of course, post hoc management is sometimes unavoidable. The discovery of the
first new Clostridium botulinum toxin in 40 years prompted a redacted 2013 paper 
in the Journal of Infectious Diseases due to dual-use concerns. This was a major event 
because the botulism-causing neurotoxic proteins are the most lethal substances 
known—fatal doses are measured in nanograms. Later it was discovered the new 
toxin responded to existing antitoxins and the full details were published. Propo- 
nents of the US policy on dual use research of concern believed this was an example 
of the policy working. Critics argued public health was risked by slowing down the 
dissemination of important information to other researchers who could have dis- 
covered the existing treatment faster. 

Research controversies are predictably unpredictable


Research controversies are like snowflakes: unique in detail, but identical 
from a distance. The principle of dangerous science outlines a general trend 
in research controversies. With few exceptions, a line of ethically questionable 
or potentially dangerous research will continue until it has been found to be 
unproductive or inferior to an alternative. However, while there are general 
trends in the evolution of research controversies, the outcomes are contingent 
upon the specific technical details, utility, and viable alternatives. In this regard, 
the principle of dangerous science has limited predictive power. However, predictive
power is not the only valuable characteristic of a theory.⁷ Rather, the
principle is a simple framework for interpreting a situation that frequently 
arises in science policy and technology assessment. 


Dangerous Science abounds 


The avian flu gain-of-function research debate is a prime example of low- 
embodiment, low-governability science and technology. However, other exam- 
ples are plentiful. For instance, cybersecurity research has even lower material 
requirements (a cheap computer and internet access) and represents a potential 
threat that is almost exclusively information-driven. In an increasingly inte- 
grated world, critical infrastructure (e.g., utilities and transportation systems) 
is at risk of disruption. The potential economic and national security risks
only increase as more systems become automated and interconnected. It is no 
surprise the field of cybersecurity is experiencing explosive growth. Cyberwar- 
fare is also full of unique ethical situations. For example, the malicious code 
Stuxnet, which is believed to have destroyed approximately 1000 Iranian ura- 
nium enrichment centrifuges, has been argued to be the first highly targeted 
ethical weapon of war (Singer & Friedman 2014). 

This in no way minimizes the continued importance of traditional material- 
centric dual-use policy. The old threats have not disappeared, but they must 
now vie for our attention among a growing crowd of technological risks. This 
discussion of the role of risk-benefit analysis also applies to technologies with 
high material embodiment. For example, a relatively new uranium enrichment 
technique using lasers was lauded by the commercial nuclear power industry 
but created considerable alarm in the nuclear non-proliferation community 
because it would make uranium enrichment much easier. When the US Nuclear 
Regulatory Commission (NRC) licensed this technology in 2012, opponents 


7 For example, game theory, originally lauded for describing economic systems, has 
lost some of its luster in recent years due to its lack of predictive power (Rubinstein 
2006). Nonetheless, it still remains a valuable framework for analyzing certain prob- 
lems and has generated useful niche applications, such as evolutionary game theory. 
criticized the commission’s narrow conception of risks—considering only the
physical security of an enrichment facility while ignoring the broader societal 
implications of encouraging the technology.⁸

Likewise, entirely new fields of controversial material-centric research are 
emerging. For example, the current debate over using geoengineering to miti- 
gate climate change has raised concerns regarding unintended climate conse- 
quences and the moral hazard of using geoengineering as an excuse to delay 
implementing more permanent and difficult mitigation strategies. 


Where do we go from here? 


The assessment and management of dangerous science is situated within a 
larger group of questions. When controversial research or technology becomes 
publicly known, questions are invariably asked regarding who funded the 
work, why it was allowed, and so on. Considering the examples of controversial 
research programs previously discussed, an important question arises: in a free 
and open society, how, if at all, should we control science research and technol- 
ogy development? To adequately address this question, we are led to a series of 
secondary issues partially addressed here: who should make the decisions (as 
many stakeholders as practicable), what form should oversight take (eliminat- 
ing the hazard is preferred to using safety equipment and procedures), and at 
what stage should controls be established (as early as possible)? However, the 
implementation of ideal science policy remains elusive. 

The most dangerous emerging and existing technologies are the ones we do 
not question. The long history of synthetic chemicals and products no longer 
commercially available due to recognized public health risks attests to the value 
of proactively assessing risk;⁹ science research is no different. When scientists
rush headlong into their research, often propelled by the competitive urge for 
first publication, we may be unpleasantly surprised on occasion. Our saving 
grace is most scientists spend a great deal of time considering their research and 
generally try to be deliberate and thoughtful in their work. However, despite 
the obvious need for scientists to consider the social and ethical impacts of 
their research, they are often actively discouraged from doing so. For example, 
when research on the de novo synthesis of the poliovirus was published in 2002, 
the editors of Science insisted on removing any discussion of ethical and social 
implications from the final article (Wimmer 2006). Without any reassurance 


8 The NRC’s potential bias toward industry-friendly limited risk analysis has been
noted elsewhere (Jaczko 2019). 

9 The widespread use of microbeads is a good example. It took many years to recognize
the negative health and environmental impacts of micro-plastics and then to
ban their use in personal care products. Some basic forethought could have pre- 
vented a lot of unnecessary harm. 
that thoughtful humans were conducting the research, it is no surprise the public
assumed the worst.

That said, self-reflection is not enough. It is unreasonable to ask scientists 
and engineers to advance their field and their careers while also dispassionately 
evaluating the potential social impacts of their work. They know a lot about 
their research and thus often have the most informed and nuanced opinions 
on its implications. However, their knowledge is deep but narrow and biased by 
personal interests. It may be better than most, but it is not enough. 

The old fantasy of detached scientists working in ivory towers falls apart in 
light of modern technology. It is increasingly possible that a few highly trained 
and well-funded scientists could, for example, regionally wipe out an entire 
species. While this might be a reasonable idea in specific cases, the gravity of 
such a solution requires a thorough exploration of the risks and implications. 
Just as you cannot un-ring a bell, some science is irreversible (although the de- 
extinction movement would argue otherwise). We should be sure of not only 
our intentions, but also the consequences of our actions. 

If unquestioned science is the most dangerous science, then open discussion 
involving many stakeholders is the solution. Many opinions from diverse backgrounds
generally improve risk assessment. This is one of the best reasons for
public funding of research and development. When the public funds research, 
more assessment and oversight is likely at the early stages when thoughtful dis- 
cussion can be most productive. This is not to say privately funded research 
is necessarily dangerous, but public funding at least gives us a better chance 
of having a say in what direction science is heading. Given the power of sci- 
ence and technology in modern society, engaged public oversight is an essential 
requirement of any truly functional democracy. 


Afterword 


Substantial portions of this book’s critique of formal risk-benefit assessment 
have been rather academic and theoretical, so let us end with some humbling 
real-world examples of why we need robust public conversations about science 
and technology policy. The following are ten informal assessments of various 
technologies made in the last century. All the assessments are made by accom- 
plished scientists, engineers, or inventors speaking in their field of expertise or 
about their own work. This list is not intended to single out any individuals for 
ridicule—imperfect foresight is a universal affliction. Rather, the point here is 
to remind ourselves that individual experts often lack the emotional distance, range
of experience, or even the time to fully imagine and consider the consequences 
of their creations. 


‘When my brother and I built and flew the first man-carrying flying 
machine, we thought we were introducing into the world an inven- 
tion which would make further wars practically impossible. That we 
were not alone in this thought is evidenced by the fact that the French 
Peace Society presented us with medals on account of our invention. 
We thought governments would realize the impossibility of winning by 
surprise attacks, and that no country would enter into war with another 
of equal size when it knew that it would have to win by simply wearing 
out the enemy.’
—Orville Wright, June 1917 letter discussing 
the use of airplanes in WWI 

‘So I repeat that while theoretically and technically television may be
feasible, yet commercially and financially, I consider it an impossibility;
a development of which we need not waste little time in dreaming’
- Lee de Forest, inventor and pioneer of
radio and film technologies, 1926


‘There is not the slightest indication that [nuclear energy] will ever be
obtainable. It would mean that the atom would have to be shattered
at will.’
- Albert Einstein, quoted in the Pittsburgh Post-Gazette in 1934


‘There is practically no chance communications space satellites will be
used to provide better telephone, telegraph, television or radio service
inside the United States’
- T.A.M. Craven, Navy engineer, radio officer and
Federal Communications Commission (FCC)
commissioner, in 1961


‘Cellular phones will absolutely not replace local wire systems. Even if
you project it beyond our lifetimes, it won’t be cheap enough’
- Marty Cooper, Motorola engineer and developer of the first cell
phone, in 1981 Christian Science Monitor interview


‘I predict the Internet will soon go spectacularly supernova and in 1996
catastrophically collapse.’
- Robert Metcalfe, co-inventor of Ethernet and
founder of 3Com, writing in 1995


‘The subscription model of buying music is bankrupt. I think you could
make available the Second Coming in a subscription model, and it
might not be successful.’
- Steve Jobs, co-founder of Apple, in a 2003
Rolling Stone interview


‘Spam will soon be a thing of the past.’
- Bill Gates, co-founder of Microsoft, claimed spam would be 
solved in two years according to a January, 2004 
BBC interview at the World Economic Forum 

‘The probability of having an accident is 50 percent lower if you have Autopilot
on. Even with our first version, it’s almost twice as good as a person.’
- Elon Musk, CEO of Tesla, in April 2016 referring to Tesla’s
autonomous driving software only months before an
Autopilot-driven Tesla drove into the side of a
turning cargo truck, resulting in the world’s
first self-driving vehicle fatality.


‘I think the idea that fake news on Facebook, of which is a very small
amount of the content, influenced the election in any way is a pretty
crazy idea.’
- Mark Zuckerberg, CEO of Facebook, in a November 2016
Techonomy conference interview referring to the 2016
US presidential elections

The public is generally enthusiastic about the latest science
and technology, but sometimes research threatens the physical
safety or ethical norms of society. When this happens, scientists
and engineers can find themselves unprepared in the midst of
an intense science policy debate. In the absence of convincing
evidence, technological optimists and skeptics struggle to find
common values on which to build consensus. The best way to
avoid these situations is to sidestep the instigating controversy
by using a broad risk-benefit assessment as a risk exploration
tool to help scientists and engineers design experiments and
technologies that accomplish intended goals while avoiding
physical or moral dangers.

Dangerous Science explores the intersection of science policy and
risk analysis to detail failures in current science policy practices
and what can be done to help minimize the negative impacts of
science and technology on society.